FIELD OF THE INVENTION 

The present invention relates to the field of adaptive systems, and more particularly to
systems and methods which are adaptive to a human user input and/or a data environment, as
well as applications for such systems and methods. More particularly, embodiments of the 
invention involve, for example, consumer electronics, personal computers, control systems, and 
professional assistance systems. 

BACKGROUND OF THE INVENTION 

The prior art is rich in various systems and methods for data analysis, as well as various 
systems and methods relating to useful endeavors. In general, most existing systems and 
methods provide concrete functions, which have a defined response to a defined stimulus. Such 
systems, while embodying the "wisdom" of the designer, have a particular shortcoming in that 
their capabilities, user interface and functionality are static. 

Intelligent or learning systems are also known. These systems are typically limited by the
particular paradigm employed, and rarely are the learning algorithms general enough to be
applied without limitation to other fields. In fact, while the general theory of learning systems is
well known, the application of such systems to particular problems often requires both a
detailed description of the problem and knowledge of the input and output spaces. Even
once these factors are known, a substantial tuning effort may be necessary to enable acceptable
operation.

Therefore, the present invention builds upon the prior art, which defines various problems
to be addressed, intelligent systems and methods, tuning paradigms and user interfaces.
As set forth below, and in the attached appendix of references and abstracts,
incorporated herein by reference, a significant number of references detail fundamental
technologies which may be improved according to the present invention, or incorporated together
to form a part of the present invention. Thus, the complete disclosure of these references,
combined with the disclosure herein, and/or with each other, is a part of the present invention.
The disclosure herein is not meant to be limiting as to the knowledge of a person of ordinary skill
in the art. Thus, prior art cited herein is intended to (1) disclose information related to the
application published before the filing or effective filing date hereof; (2) define the problem in
the art to which the present invention is directed; (3) define prior art methods of solving various
problems also addressed by the present invention; (4) define the state of the art with respect to
methods disclosed or referenced herein; (5) detail technologies used to implement methods or
apparatus in accordance with the present invention; and/or (6) define elements of the invention as
disclosed in individual references, combinations of references, and/or combinations of disclosure
of the references with the express disclosure herein.

HUMAN INTERFACE 

Aspects of the present invention provide an advanced user interface. The subject of
man-machine interfaces has been studied for many years, and indeed the entire field of ergonomics
and human factors engineering revolves around optimization of human-machine interfaces.
Typically, the optimization scheme optimizes the mechanical elements of a design, or seeks to
provide a universally optimized interface. Thus, a single user interface is typically provided for a
system, although some systems have multiple different interfaces which may be related or 
unrelated. In fact, some systems provide a variety of related interfaces, for example, novice, 
intermediate and advanced, to provide differing balances between available control and presented 
complexity. Further, adaptive and/or responsive human-machine computer interfaces are now 
well known. However, a typical problem presented is to define a self-consistent and useful (i.e., 
an improvement over a well-designed static interface) theory for altering the interface. 
Therefore, even where, in a given application, a theory for optimization exists, the theory is 
typically not generalizable to other applications. Therefore, one aspect of the present invention is 
to provide such an overall theory by which adaptive and/or responsive user interfaces may be 
constructed and deployed. 

In a particular application, the user interface according to the present invention may be 
applied to general-purpose-type computer systems, for example, personal computers. While it 
might seem that a general-purpose-type computer system interface would necessarily be general 
purpose, and therefore not require modification for the many potential uses, this is not the case. 
In fact, the lack of application specificity may make such an interface difficult to use, decreasing 
efficiency of use and increasing user frustration and the probability of error. One aspect of the 
present invention thus relates to a programmable device that comprises a menu-driven interface 
in which the user enters information using a direct manipulation input device. An earlier type of 
interface scheme addressing this issue is disclosed in Verplank, William L., "Graphics in 
Human-Computer Communication: Principles of Graphical User-Interface Design", Xerox 
Office Systems. See the references cited therein: Foley, J.D., Wallace, V.L., Chan, P., "The 
Human Factor of Computer Graphics Interaction Techniques", IEEE CG&A, Nov. 1984, pp.
13-48; Koch, H., "Ergonomische Betrachtung von Schreibtastaturen", Humane Production, 1, pp.
12-15 (1985); Norman, D.A., Fisher, D., "Why Alphabetic Keyboards Are Not Easy To Use: 
Keyboard Layout Doesn't Much Matter", Human Factors 24(5), pp. 509-519 (1982); 
Perspectives: High Technology 2, 1985; Knowlton, K., "Virtual Pushbuttons as a Means of 
Person-Machine Interaction", Proc. of Conf. Computer Graphics, Pattern Recognition and Data 
Structure, Beverly Hills, California, May 1975, pp. 350-352; "Machine Now Reads, Enters
Information 25 Times Faster Than Human Keyboard Operators", Information Display 9, p. 18
(1981); "Scanner Converts Materials to Electronic Files for PCs", IEEE CG&A, Dec. 1984, p.
76; "New Beetle Cursor Director Escapes All Surface Constraints", Information Display 10, p. 
12, 1984; Lu, C, "Computer Pointing Devices: Living With Mice", High Technology, Jan. 1984, 
pp. 61-65; "Finger Painting", Information Display 12, p. 18, 1981; Kraiss, K.F., "Neuere 
Methoden der Interaktion an der Schnittstelle Mensch-Maschine", Z.F. Arbeitswissenschaft, 2, 
pp. 65-70, 1978; Hirzinger, G., Landzettel, K., "Sensory Feedback Structures for Robots with 
Supervised Learning", IEEE Conf. on Robotics and Automation, St. Louis, March 1985; Horgan, 
H., "Medical Electronics", IEEE Spectrum, Jan. 1984, pp. 90-93. 

A menu based remote control-contained display device is disclosed in Platte, Oberjatzas, 
and Voessing, "A New Intelligent Remote Control Unit for Consumer Electronic Device", IEEE 
Transactions on Consumer Electronics, Vol. CE-31, No. 1, February 1985, 59-68.

It is noted that in text-based applications, an input device that is accessible, without the 
necessity of moving the user's hands from the keyboard, may be preferred. Known manual input 
devices include the trackball, mouse, and joystick. In addition, other devices are known, 
including the so-called "J-cursor" or "mousekey" which embeds a two (x,y) or three (x,y,p) axis 
pressure sensor in a button conformed to a finger, present in a general purpose keyboard; a 
keyboard joystick of the type described in Electronic Engineering Times, October 28, 1991, p. 
62, "IBM Points a New Way"; a so-called "isobar" which provides a two axis input by optical 
sensors (θ, x); a two and one half axis (x, y, digital input) input device, such as a mouse or a
"felix" device, infrared, acoustic, etc.; position sensors for determining the position of a finger or 
pointer on a display screen (touch-screen input) or on a touch surface, e.g., "GlidePoint" 
(ALPS/Cirque); goniometer input (angle position, such as human joint position detector), etc. 
Many such suitable devices are summarized in Kraiss, K. F., "Alternative Input Devices For
Human Computer Interaction", Forschungsinstitut für Anthropotechnik, Werthhoven, F.R.
Germany. Another device which may also be suitable is the GyroPoint, available from Gyration
Inc., which provides 2-D or 3-D input information in up to six axes of motion: height, length, 
depth, roll, pitch and yaw. Such a device may be useful to assist a user in inputting a complex 
description of an object, by providing substantially more degrees of freedom sensing than 
minimally required by a standard graphic user interface. The many degrees of freedom available
thus provide suitable input for various types of systems, such as "Virtual Reality" systems or systems
which track a moving object, where many degrees of freedom and a high degree of input accuracy are required.
The Hallpot, a device which pivots a magnet about a Hall effect sensor to produce angular 
orientation information, a pair of which may be used to provide information about two axes of 
displacement, available from Elweco, Inc, Willoughby, OH, may also be employed as an input 
device. 

User input devices may be broken down into a number of categories: direct inputs, i.e.,
touch-screen and light pen; indirect inputs, i.e., trackball, joystick, mouse, touch-tablet, bar code
scanner (see, e.g., Atkinson, Terry, "VCR Programming: Making Life Easier Using Bar Codes"),
keyboard, and multi-function keys; and interactive inputs, i.e., voice activation/instructions (see,
e.g., Rosch, Winn L., "Voice Recognition: Understanding the Master's Voice", PC Magazine,
October 27, 1987, 261-308), eye tracker, and data suit/data glove (see, e.g., Tello, Ernest R.,
"Between Man And Machine", Byte, September 1988, 288-293; products of EXOS, Inc; Data
Glove). Each of the aforementioned input devices has advantages and disadvantages, which are
known in the art.

Studies suggest that a "direct manipulation" style of interface has advantages for menu 
selection tasks. This type of interface provides visual objects on a display screen, which can be 
manipulated by "pointing" and "clicking" on them. For example, the popular Graphical User 
Interfaces ("GUIs"), such as Macintosh and Microsoft Windows, and others known in the art, use 
a direct manipulation style interface. A device such as a touch-screen, with a more natural 
selection technique, is technically preferable to the direct manipulation method. However, the 
accuracy limitations and relatively high cost make other inputs more commercially practical. 
Further, for extended interactive use, touchscreens are not a panacea for office productivity 
applications. In addition, the user must be within arm's length of the touch-screen display. In a
cursor positioning task, Albert (1982) found the trackball to be the most accurate pointing device 
and the touch-screen to be the least accurate when compared with other input devices such as the 
light pen, joystick, data tablet, trackball, and keyboard. Epps (1986) found both the mouse and 
trackball to be somewhat faster than both the touch-pad and joystick, but he concluded that there 
were no significant performance differences between the mouse and trackball as compared with 
the touch-pad and joystick. 

A particular focus of the present invention is the application of the principles herein to
consumer electronic devices and simple controls. The videocassette recorder (VCR) device 
exemplifies many of the issues presented. There have been many proposals and implementations 
seeking to improve the operation of the VCR control system. For example, a directional or direct 
manipulation-type sensor based infrared remote control is disclosed in Zeisel, Tomas, 
Tomaszewski, "An Interactive Menu-Driven Remote Control Unit for TV-Receivers and 
VC-Recorders", IEEE Transactions on Consumer Electronics, Vol. 34, No. 3, 814-818 (1988), 
which relates to a control for programming with the West German Videotext system. This
implementation differs from the Videotext programming system described in Bensch, U.,
"VPV - VIDEOTEXT PROGRAMS VIDEORECORDER", IEEE Transactions on Consumer 
Electronics, Vol. 34, No. 3, 788-792 (1988), which describes the system of Video Program 
System Signal Transmitters, in which the VCR is programmed by entering a code for the Video 
Program System signal, which is emitted by television stations in West Germany. Each separate 
program has a unique identifier code, transmitted at the beginning of the program, so that a user 
need only enter the code for the program, and the VCR will monitor the channel for the code 
transmission, and begin recording when the code is received, regardless of schedule changes. 
The Videotext Programs Recorder (VPV) disclosed does not intelligently interpret the
transmission; rather, the system reads the transmitted code as a literal label, without any analysis
or determination of a classification of the program type.
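
The label-triggered recording described above can be illustrated with a minimal sketch. It is not taken from the Bensch reference; the function name `recorded_ticks`, the representation of the broadcast as (tick, label) pairs, and the rule that recording stops when a different program's code arrives are all assumptions made for illustration. The sketch shows only the literal comparison of the received code against the programmed code, with no classification of program type.

```python
from typing import Iterable, List, Optional, Tuple


def recorded_ticks(target_code: str,
                   stream: Iterable[Tuple[int, Optional[str]]]) -> List[int]:
    """Return the ticks during which a label-triggered recorder is recording.

    `stream` yields (tick, label) pairs; label is None when no program
    code is transmitted on that tick (hypothetical representation).
    """
    recording = False
    ticks: List[int] = []
    for tick, label in stream:
        if label == target_code:
            # Literal match only: the code is treated as an opaque label.
            recording = True
        elif label is not None:
            # Assumption: the code of a different program ends the recording.
            recording = False
        if recording:
            ticks.append(tick)
    return ticks


# The same program ("B7") broadcast on schedule and one tick late.
on_time = [(0, "A1"), (1, None), (2, "B7"), (3, None), (4, "C3")]
delayed = [(0, "A1"), (1, None), (2, None), (3, "B7"), (4, None), (5, "C3")]
print(recorded_ticks("B7", on_time))
print(recorded_ticks("B7", delayed))
```

Because the trigger is the received code rather than a clock time, the delayed broadcast is still captured from its actual start.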

The following references are also relevant to the interface aspects of the present 
invention: 

Hoffberg, Linda I, "AN IMPROVED HUMAN FACTORED INTERFACE FOR 
PROGRAMMABLE DEVICES: A CASE STUDY OF THE VCR" Master's Thesis, Tufts 
University (Master of Sciences in Engineering Design, November, 1990). 

"Bar Code Programs VCR", Design News, February 1, 1988, 26. 

"How to find the best value in VCRs", Consumer Reports, March 1988, 135-141. 

"Low-Cost VCRs: More For Less", Consumer Reports, March 1990, 168-172. 

"Nielsen Views VCRs", Television Digest, June 23, 1988, 15. 

"The Highs and Lows of Nielsen Homevideo Index", Marketing & Media Decisions, 
November 1985, 84-86+. 

"The Quest for 'User Friendly'", U.S. News & World Report, June 13, 1988, 54-56. 

"The Smart House: Human Factors in Home Automation", Human Factors in Practice, 
Dec. 1990, 1-36. 

"VCR, Camcorder Trends", Television Digest, Vol. 29:16 (March 20, 1989). 

"VCR's: A Look At The Top Of The Line", Consumer Reports, March 1989, 167-170. 

"VHS Videocassette Recorders", Consumer Guide, 1990, 17-20. 

Abedini, Kamran, "An Ergonomically-improved Remote Control Unit Design", Interface 
'87 Proceedings, 375-380. 

Abedini, Kamran, and Hadad, George, "Guidelines For Designing Better VCRs", Report 
No. IME 462, February 4, 1987. 

Bensch, U., "VPV - VIDEOTEXT PROGRAMS VIDEORECORDER", IEEE 
Transactions on Consumer Electronics, 34 (3): 788-792. 

Berger, Ivan, "Secrets of the Universals", Video, February 1989, 45-47+. 

Beringer, D.B., "A Comparative Evaluation of Calculator Watch Data Entry 
Technologies: Keyboards to Chalkboards", Applied Ergonomics, December 1985, 275-278. 

Bier, E. A. et al. "MMM: A User Interface Architecture for Shared Editors on a Single 
Screen," Proceedings of the ACM Symposium on User Interface Software and Technology, Nov. 
11-13, 1991, p. 79. 

Bishop, Edward W., and Guinness, G. Victor Jr., "Human Factors Interaction with 
Industrial Design", Human Factors, 8(4):279-289 (August 1966). 

Brown, Edward, "Human Factors Concepts For Management", Proceedings of the Human 
Factors Society, 1973, 372-375. 

Bulkeley, Debra, "The Smartest House in America", Design News, October 19, 1987, 
56-61. 

Card, Stuart K., "A Method for Calculating Performance times for Users of Interactive 
Computing Systems", IEEE, 1979, 653-658. 

Carlson, Mark A., "Design Goals for an Effective User Interface", Electro/82 
Proceedings, 3/1/1-3/1/4. 

Carlson, Mark A., "Design Goals for an Effective User Interface", Human Interfacing 
with Instruments, Session 3. 

Carroll, Paul B., "High Tech Gear Draws Cries of 'Uncle'", Wall Street Journal, April 27, 
1988, 29. 

Cobb, Nathan, "I don't get it", Boston Sunday Globe Magazine, March 25, 1990, 23-29. 

Davis, Fred, "The Great Look-and-Feel Debate", A+, 5:9-11 (July 1987). 

Dehning, Waltraud, Essig, Heidrun, and Maass, Susanne, The Adaptation of Virtual 
Man-Computer Interfaces to User Requirements in Dialogs, Germany: Springer-Verlag, 1981. 

Ehrenreich, S.L., "Computer Abbreviations - Evidence and Synthesis", Human Factors, 
27(2):143-155 (April 1985). 

Friedman, M.B., "An Eye Gaze Controlled Keyboard", Proceedings of the 2nd 
International Conference on Rehabilitation Engineering, 1984, 446-447. 

Gilfoil, D.M. and Mauro, C.L., "Integrating Human Factors and Design: Matching Human 
Factors Methods up to Product Development", C.L. Mauro Assoc., Inc., 1-7. 

Gould, John D., Boies, Stephen J., Meluson, Antonia, Rasammy, Marwan, and Vosburgh, 
Ann Marie, "Entry and Selection Methods For Specifying Dates", Human Factors, 32(2):199-214 
(April 1989). 

Green, Lee, "Thermo Tech: Here's a common sense guide to the new thinking 
thermostats", Popular Mechanics, October 1985, 155-159. 

Grudin, Jonathan, "The Case Against User Interface Consistency", MCC Technical 
Report Number ACA-HI-002-89, January 1989. 



Hoffberg et al. 



-8- 



LIH-13 






Harvey, Michael G., and Rothe, James T., "VideoCassette Recorders: Their Impact on 
Viewers and Advertisers", Journal of Advertising, 25:19-29 (December/January 1985). 

Hawkins, William J., "Super Remotes", Popular Science, February 1989, 76-77. 

Henke, Lucy L., and Donohue, Thomas R., "Functional Displacement of Traditional TV 
Viewing by VCR Owners", Journal of Advertising Research, 29:18-24 (April-May 1989). 

Hoban, Phoebe, "Stacking the Decks", New York, February 16, 1987, 20:14. 

Howard, Bill, "Point and Shoot Devices", PC Magazine, 6:95-97 (August 1987). 

Jane Pauley Special, NBC TV News Transcript, July 17, 1990, 10:00 PM. 

Kolson, Ann, "Computer wimps drown in a raging sea of technology", The Hartford 
Courant, May 24, 1989, B1. 

Kreifeldt, J.G., "A Methodology For Consumer Product Safety Analysis", The 3rd 
National Symposium on Human Factors in Industrial Design in Consumer Products, August 
1982, 175-184. 

Kreifeldt, John, "Human Factors Approach to Medical Instrument Design", Electro/82 
Proceedings, 3/3/1-3/3/6. 

Kuocheng, Andy Poing, and Ellingstad, Vernon S., "Touch Tablet and Touch Input", 
Interface '87, 327. 

Ledgard, Henry, Singer, Andrew, and Whiteside, John, Directions in Human Factors for 
Interactive Systems, New York, Springer-Verlag, 1981. 

Lee, Eric, and MacGregor, James, "Minimizing User Search Time Menu Retrieval 
Systems", Human Factors, 27(2):157-162 (April 1986). 

Leon, Carol Boyd, "Selling Through the VCR", American Demographics, December 
1987, 40-43. 

Long, John, "The Effect of Display Format on the Direct Entry of Numerical Information 
by Pointing", Human Factors, 26(1):3-17 (February 1984). 

Mantei, Marilyn M., and Teorey, Toby J., "Cost/Benefit Analysis for Incorporating 
Human Factors in the Software Lifecycle", Association for Computing Machinery, 1988. 

Meads, Jon A., "Friendly or Frivolous", Datamation, April 1, 1988, 98-100. 

Moore, T.G. and Dartnall, "Human Factors of a Microelectronic Product: The Central 
Heating Timer/Programmer", Applied Ergonomics, 1983, 13(1):15-23. 

Norman, Donald A., "Infuriating By Design", Psychology Today, 22(3):52-56 (March 
1988). 

Norman, Donald A., The Psychology of Everyday Things, New York, Basic Books, Inc., 
1988. 

Platte, Hans-Joachim, Oberjatzas, Gunter, and Voessing, Walter, "A New Intelligent 
Remote Control Unit for Consumer Electronic Device", IEEE Transactions on Consumer 
Electronics, Vol. CE-31(1):59-68 (February 1985). 

Rogus, John G. and Armstrong, Richard, "Use of Human Engineering Standards in 
Design", Human Factors, 19(1):15-23 (February 1977). 

Rosch, Winn L., "Voice Recognition: Understanding the Master's Voice", PC Magazine, 
October 27, 1987, 261-308. 

Sarver, Carleton, "A Perfect Friendship", High Fidelity, 39:42-49 (May 1989). 

Schmitt, Lee, "Let's Discuss Programmable Controllers", Modern Machine Shop, May 
1987, 90-99. 

Shneiderman, Ben, Designing the User Interface: Strategies for Effective 
Human-Computer Interaction, Reading, MA, Addison-Wesley, 1987. 

Smith, Sidney J., and Mosier, Jane N., Guidelines for Designing User Interface Software, 
Bedford, MA, MITRE, 1986. 

Sperling, Barbara Bied, Tullis, Thomas S., "Are You a Better 'Mouser' or 'Trackballer'? A 
Comparison of Cursor-Positioning Performance", An Interactive/Poster Session at the 
CHI+GI'87 Graphics Interface and Human Factors in Computing Systems Conference. 

Streeter, L.A., Ackroff, J.M., and Taylor, G.A. "On Abbreviating Command Names", The 
Bell System Technical Journal, 62(6):1807-1826 (July/August 1983). 

Swanson, David, and Klopfenstein, Bruce, "How to Forecast VCR Penetration", 
American Demographics, December 1987, 44-45. 

Tello, Ernest R., "Between Man And Machine", Byte, September 1988, 288-293. 

Thomas, John C., and Schneider, Michael L., Human Factors in Computer Systems, New 
Jersey, Ablex Publ. Co., 1984. 

Trachtenberg, Jeffrey A., "How do we confuse thee? Let us count the ways", Forbes, 
March 21, 1988, 159-160. 

Tyldesley, D.A., "Employing Usability Engineering in the Development of Office 
Products", The Computer Journal, 31(5):431-436 (1988). 

Verplank, William L., "Graphics in Human-Computer Communication: Principles of 
Graphical User-Interface Design", Xerox Office Systems. 

Voyt, Carlton F., "PLC's Learn New Languages", Design News, January 2, 1989, 78. 

Whitefield, A. "Human Factors Aspects of Pointing as an Input Technique in Interactive 
Computer Systems", Applied Ergonomics, June 1986, 97-104. 

Wiedenbeck, Susan, Lambert, Robin, and Scholtz, Jean, "Using Protocol Analysis to 
Study the User Interface", Bulletin of the American Society for Information Science, June/July 
1989, 25-26. 

Wilke, William, "Easy Operation of Instruments by Both Man and Machine", Electro/82 
Proceedings, 3/2/1-3/2/4. 

Yoder, Stephen Kreider, "U.S. Inventors Thrive at Electronics Show", The Wall Street 
Journal, January 10, 1990, B1. 

Zeisel, Gunter, Tomas, Philippe, Tomaszewski, Peter, "An Interactive Menu-Driven 
Remote Control Unit for TV-Receivers and VC-Recorders", IEEE Transactions on Consumer 
Electronics, 34(3):814-818. 

AGENT TECHNOLOGIES 

Presently well known human computer interfaces include so-called agent technology, in 
which the computer interface learns a task defined (inherently or explicitly) by the user and 
subsequently executes the task or negotiates with other systems to achieve the results desired by 
the user. The user task may be defined explicitly, by defining a set of rules to be followed, or 
implicitly, by observing the user during completion of the specified task and generalizing the 
observations into a construct or "agent". Such systems are available from Firefly (www.firefly.com), 
and are commercially present in some on-line commerce systems, such as Amazon.com 
(www.amazon.com). There is some debate in the art as to what constitutes an "agent". Herein, 
such "agent" technology shall be interpreted to encompass any automated method or system 
which embodies decision-making capability defined by or derived from the user, and which may 
vary between different users. See: 

"ABI WHAP, Web Hypertext Applications Processor," 
http://alphabase.com/abi3/whapinfo.html#profiling, (1996, Jul. 11). 

"AdForce Feature Set", http://www.imgis.com/index.html/core/p2-2html (1997, Apr. 
11). 

"IPRO," http://www.ipro.com/, Internet profiles Corporation Home and other Web Pages 
(1996, Jul. 11). 

"Media Planning is Redefined in a New Era of Online Advertising," PR Newswire, 
(1996, Feb. 5). 

"My Yahoo! news summary for My Yahoo! Quotes", http://my.yahoo.com, (1997, Jan. 
27). 

"NetGravity Announces Adserver 2.1", 
http://www.netgravity.com/news/pressrel/launch21.html (1997, Apr. 11). 

"Netscape & NetGravity: Any Questions?", http://www.netgravity.com/, (1996, Jul. 11). 

"Network Site Main", http://www.doubleclick.net/frames/general/nets2set.htm (1997, 
Apr. 11). 

"Real Media," http://www.realmedia.com/index.html, (1996, Jul. 11). 

"The Front Page", http://live.excite.com/7aBb (1997, Jan. 27) and (1997, Apr. 11). 

"The PointCast Network," http://www.pointcast.com/, (1996, Spring). 

"The Power of PenPoint", Carr et al., 1991, p. 39, Chapter 13, pp. 258-260. 

"Welcome to Lycos," http://www.lycos.com, (1997, Jan. 27). 

Abatemarco, Fred, "From the Editor", Popular Science, Sep. 1992, p. 4. 

Berniker, M., "Nielsen plans Internet Service," Broadcasting & Cable, 125(30):34 (1995, 
Jul. 24). 

Berry, Deanne, et al. In an Apr. 10, 1990 news release, Symantec announced a new 
version of MORE (TM). 

Betts, M., "Sentry cuts access to naughty bits," Computers and Security, vol. 14, No. 7, p. 
615 (1995). 

Boy, Guy A., Intelligent Assistant Systems, Harcourt Brace Jovanovich, 1991, uses the 
term "Intelligent Assistant Systems". 

Bussey, H.E., et al., "Service Architecture, Prototype Description, and Network 
Implications of a Personalized Information Grazing Service," IEEE Multiple Facets of 
Integration Conference Proceedings, vol. 3, No. Conf. 9, Jun. 3, 1990, pp. 1046-1053. 

Donnelley, J.E., "WWW media distribution via Hopwise Reliable Multicast," Computer 
Networks and ISDN Systems, vol. 27, No. 6, pp. 781-788 (Apr., 1995). 

Edwards, John R., "Q&A: Integrated Software with Macros and an Intelligent Assistant", 
Byte Magazine, Jan. 1986, vol. 11, Issue 1, pp. 120-122, critiques the Intelligent Assistant by 
Symantec Corporation. 

Elofson, G. and Konsynski, B., "Delegation Technologies: Environmental Scanning with 
Intelligent Agents", Journal of Management Information Systems, Summer 1991, vol. 8, Issue 1, 
pp. 37-62. 

Garretson, R., "IBM Adds 'Drawing Assistant' Design Tool to Graphics Series", PC 
Week, Aug. 13, 1985, vol. 2, Issue 32, p. 8. 

Gessler, S. and Kotulla A., "PDAs as mobile WWW browsers," Computer Networks and 
ISDN Systems, vol. 28, No. 1-2, pp. 53-59 (Dec. 1995). 

Glinert-Stevens, Susan, "Microsoft Publisher: Desktop Wizardry", PC Sources, Feb., 
1992, vol. 3, Issue 2, p. 357. 

Goldberg, Cheryl, "IBM Drawing Assistant: Graphics for the EGA", PC Magazine, Dec. 
24, 1985, vol. 4, Issue 26, p. 255. 

Hendrix, Gary G. and Walter, Brett A., "The Intelligent Assistant: Technical 
Considerations Involved in Designing Q&A's Natural-language Interface", Byte Magazine, Dec. 
1987, vol. 12, Issue 14, p. 251. 
Hoffman, D.L. et al., "A New Marketing Paradigm for Electronic Commerce," (1996, 
Feb. 19), http://www2000.ogsm.vanderbilt.edu/novak/new.marketing.paradigm.html. 

Information describing BroadVision One-to-One Application System: "Overview," p. 1; 
Further Resources on One-To-One Marketing, p. 1; BroadVision Unleashes the Power of the 
Internet with Personalized Marketing and Selling, pp. 1-3; Frequently Asked Questions, pp. 1-3; 
Products, p. 1; BroadVision One-To-One(.TM.), pp. 1-2; Dynamic Command Center, p. 1; 
Architecture that Scales, pp. 1-2; Technology, p. 1; Creating a New Medium for Marketing and 
Selling BroadVision One-To-One and the World Wide Web a White Paper, pp. 1-15; 
http://www.broadvision.com (1996, Jan.-Mar.). 

Jones, R., "Digital's World-Wide Web server: A case study," Computer Networks and 
ISDN Systems, vol. 27, No. 2, pp. 297-306 (Nov. 1994). 

McFadden, M., "The Web and the Cookie Monster," Digital Age, (1996, Aug.). 

Nadoli, Gajanana and Biegel, John, "Intelligent Agents in the Simulation of 
Manufacturing Systems", Proceedings of the SCS Multiconference on AI and Simulation, 1989. 

Nilsson, B.A., "Microsoft Publisher is an Honorable Start for DTP Beginners", Computer 
Shopper, Feb. 1992, vol. 12, Issue 2, p. 426, evaluates Microsoft Publisher and Page Wizard. 

O'Connor, Rory J., "Apple Banking on Newton's Brain", San Jose Mercury News, 
Wednesday, Apr. 22, 1992. 

Ohsawa, I. and Yonezawa, A., "A Computational Model of an Intelligent Agent Who 
Talks with a Person", Research Reports on Information Sciences, Series C, Apr. 1989, No. 92, 

Jeffrey M. Bradshaw, Kenneth M. Ford, Jack R. Adams-Webber, John H. Boose (1993) 
Beyond the Repertory Grid: New Approaches to Constructivist Knowledge Acquisition Tool 
Development. In K. M. Ford & J. M. Bradshaw (Ed.) Knowledge Acquisition as Modeling. 

Maglio, Paul P. and Rob Barrett (1997) How to Build Modeling Agents to Support Web 
Searchers. In Proceedings of the Sixth International Conference on User Modeling. 
http://www.cs.mu.oz.au/agentlab/VL/ps/MaglioP.ps 

Marchiori, Massimo (1996) The quest for correct information on the Web: hyper search 
engines. In Hyper Proceeding of the Sixth International World Wide Web Conference. 

Capturing a Conceptual Model for End-User Programming: Task Ontology as a Static User 

INDUSTRIAL CONTROLS 

Industrial control systems are well known. Typically, a dedicated reliable hardware 
module controls a task using a conventional algorithm, with a low level user interface. These 
devices are programmable, and therefore a high level software program may be provided to 
translate user instructions into the low level commands, and to analyze any return data. See, U.S. 
Patent No. 5,506,768, expressly incorporated herein by reference. See, also: 

A. B. Corripio, "Tuning of Industrial Control Systems", Instrument Society of America, 
Research Triangle Park, NC (1990) pp. 65-81. 

C. J. Harris & S. A. Billings, "Self-Tuning and Adaptive Control: Theory and 
Applications", Peter Peregrinus LTD (1981) pp. 20-33. 

C. Rohrer & Clay Nesler, "Self-Tuning Using a Pattern Recognition Approach", Johnson 
Controls, Inc., Research Brief 228 (Jun. 13, 1986). 

D. E. Seborg, T. F. Edgar, & D. A. Mellichamp, "Process Dynamics and Control", John 
Wiley & Sons, NY (1989) pp. 294-307, 538-541. 

E. H. Bristol & T. W. Kraus, "Life with Pattern Adaptation", Proceedings 1984 American 
Control Conference, pp. 888-892, San Diego, CA (1984). 

Francis Scheid, "Schaum's Outline Series-Theory & Problems of Numerical Analysis", 
McGraw-Hill Book Co., NY (1968) pp. 236, 237, 243, 244, 261. 

K. J. Astrom and B. Wittenmark, "Adaptive Control", Addison-Wesley Publishing 
Company (1989) pp. 105-215. 

K. J. Astrom, T. Hagglund, "Automatic Tuning of PID Controllers", Instrument Society 
of America, Research Triangle Park, NC (1988) pp. 105-132. 

R. W. Haines, "HVAC Systems Design Handbook", TAB Professional and Reference 
Books, Blue Ridge Summit, PA (1988) pp. 170-177. 

S. M. Pandit & S. M. Wu, "Time Series & System Analysis with Applications", John 
Wiley & Sons, Inc., NY (1983) pp. 200-205. 

T. W. Kraus & T. J. Myron, "Self-Tuning PID Controller Uses Pattern Recognition 
Approach", Control Engineering, pp. 106-111, Jun. 1984. 

PATTERN RECOGNITION 

Another aspect of some embodiments of the invention relates to signal analysis and 
complex pattern recognition. This aspect encompasses analysis of any data set presented to the 
system: internal, user interface, or the environment in which it operates. While semantic, optical 
and audio analysis systems are known, the invention is by no means limited to these types of 
data. 

Pattern recognition involves examining a complex data set to determine similarities (in its 
broadest context) with other data sets, typically data sets that have been previously characterized. 
These data sets may comprise multivariate inputs, sequences in time or other dimension, or a 
combination of both multivariate data sets with multiple dimensions. 

The following cited patents and publications are relevant to pattern recognition and 
control aspects of the present invention, and are herein expressly incorporated by reference: 

U.S. Patent 5,067,163, incorporated herein by reference, discloses a method for 
determining a desired image signal range from an image having a single background, in 
particular a radiation image such as a medical X-ray. This reference teaches basic image 
enhancement techniques. 

U.S. Patent 5,068,664, incorporated herein by reference, discloses a method and device 
for recognizing a target among a plurality of known targets, by using a probability based 
recognition system. This patent document cites a number of other references, which are relevant 
to the problem of image recognition: 

Appriou, A., "Interet des theories de l'incertain en fusion de donnees", Colloque 
International sur le Radar Paris, 24-28 avril 1989. 

Appriou, A., "Procedure d'aide a la decision multi-informateurs. Applications a la 
classification multi-capteurs de cibles", Symposium de l'Avionics Panel (AGARD) Turquie, 
25-29 avril 1988. 






Arrow, K. J., "Social choice and individual values", John Wiley and Sons Inc. (1963). 

Bellman, R. E., L. A. Zadeh, "Decision making in a fuzzy environment", Management 
Science, 17(4) (December 1970). 

Bhatnagar, R. K., L. N. Kamal, "Handling uncertain information: a review of numeric 
and non-numeric methods", Uncertainty in Artificial Intelligence, L. N. Kamal and J. F. Lemmer, 
Eds. (1986). 

Blair, D., R. Pollack, "La logique du choix collectif", Pour la Science (1983). 

Chao, J. J., E. Drakopoulos, C. C. Lee, "An evidential reasoning approach to distributed 
multiple hypothesis detection", Proceedings of the 20th Conference on decision and control, Los 
Angeles, Calif., December 1987. 

Dempster, A. P., "A generalization of Bayesian inference", Journal of the Royal 
Statistical Society, Vol. 30, Series B (1968). 

Dempster, A. P., "Upper and lower probabilities induced by a multivalued mapping", 
Annals of mathematical Statistics, no. 38 (1967). 

Dubois, D., "Modeles mathematiques de l'imprecis et de l'incertain en vue d'applications 
aux techniques d'aide a la decision", Doctoral Thesis, University of Grenoble (1983). 

Dubois, D., N. Prade, "Combination of uncertainty with belief functions: a 
reexamination", Proceedings 9th International Joint Conference on Artificial Intelligence, Los 
Angeles (1985). 

Dubois, D., N. Prade, "Fuzzy sets and systems-Theory and applications", Academic 
Press, New York (1980). 

Dubois, D., N. Prade, "Theorie des possibilites: application a la representation des 
connaissances en informatique", Masson, Paris (1985). 

Duda, R. O., P. E. Hart, M. J. Nilsson, "Subjective Bayesian methods for rule-based 
inference systems", Technical Note 124-Artificial Intelligence Center-SRI International. 

Fua, P. V., "Using probability density functions in the framework of evidential reasoning 
Uncertainty in knowledge based systems", B. Bouchon, R. R. Yager, Eds. Springer Verlag 
(1987). 

Ishizuka, M, "Inference methods based on extended Dempster and Shafer's theory for 
problems with uncertainty/fuzziness", New Generation Computing, 1:159-168 (1983), Ohmsha, 
Ltd, and Springer Verlag. 

Jeffrey, R. J., "The logic of decision", The University of Chicago Press, Ltd., London 
(1983)(2nd Ed.). 

Kaufmann, A., "Introduction a la theorie des sous-ensembles flous", Vol. 1, 2 et 
3-Masson-Paris (1975). 

Keeney, R. L., B. Raiffa, "Decisions with multiple objectives: Preferences and value 
tradeoffs", John Wiley and Sons, New York (1976). 

Ksienski et al., "Low Frequency Approach to Target Identification", Proc. of the IEEE, 
63(12):1651-1660 (Dec. 1975). 

Kyburg, H. E., "Bayesian and non Bayesian evidential updating", Artificial Intelligence 
31:271-293 (1987). 

Roy, B., "Classements et choix en presence de points de vue multiples", R.I.R.O.-2eme 
annee-no. 8, pp. 57-75 (1968). 

Roy, B., "Electre III: un algorithme de classements fonde sur une representation floue des 
preferences en presence de criteres multiples", Cahiers du CERO, 20(1):3-24 (1978). 

Scharlic, A., "Decider sur plusieurs criteres. Panorama de l'aide a la decision multicritere", 
Presses Polytechniques Romandes (1985). 

Shafer, G., "A mathematical theory of evidence", Princeton University Press, Princeton, 
New Jersey (1976). 

Sugeno, M., "Theory of fuzzy integrals and its applications", Tokyo Institute of 
Technology (1974). 

Vannicola et al, "Applications of Knowledge based Systems to Surveillance", 
Proceedings of the 1988 IEEE National Radar Conference, 20-21 Apr. 1988, pp. 157-164. 

Yager, R. R., "Entropy and specificity in a mathematical theory of Evidence", Int. J. 
General Systems, 9:249-260 (1983). 

Zadeh, L. A., "Fuzzy sets as a basis for a theory of possibility", Fuzzy sets and Systems 
1:3-28 (1978). 

Zadeh, L. A., "Fuzzy sets", Information and Control, 8:338-353 (1965). 

Zadeh, L. A., "Probability measures of fuzzy events", Journal of Mathematical Analysis 
and Applications, 23:421-427 (1968). 

U.S. Patent No. 5,067,161, incorporated herein by reference, relates to a video image 
pattern recognition system, which recognizes objects in near real time. 

U.S. Patent Nos. 4,817,176 and 4,802,230, both incorporated herein by reference, relate 
to harmonic transform methods of pattern matching of an undetermined pattern to known 
patterns, and are useful in the pattern recognition method of the present invention. U.S. Patent 
4,998,286, incorporated herein by reference, relates to a harmonic transform method for 
comparing multidimensional images, such as color images, and is useful in the present pattern 
recognition methods. 

U.S. Patent 5,067,166, incorporated herein by reference, relates to a pattern recognition 
system, in which a local optimum match is determined between subsets of candidate reference 
label sequences and candidate templates. It is clear that this method is useful in the pattern 
recognition aspects of the present invention. It is also clear that the interface and control 
system of the present invention are useful adjuncts to the method disclosed in U.S. Patent 
5,067,166. 

U.S. Patent 5,048,095, incorporated herein by reference, relates to the use of a genetic 
learning algorithm to adaptively segment images, which is an initial stage in image recognition. 
This patent has a software listing for this method. It is clear that this method is useful in the 
pattern recognition aspects of the present invention. It is also clear that the interface and control 
system of the present invention are useful adjuncts to the method disclosed in U.S. Patent 
5,048,095. 

FRACTAL-BASED IMAGE PROCESSING 

Fractals are a relatively new field of science and technology that relate to the study of 
order and chaos. While the field of fractals is now very dense, a number of relevant principles 
are applicable. First, when the coordinate axes of a space are not independent, and are related by 
a recursive algorithm, then the space is considered to have a fractional dimensionality. One 
characteristic of such systems is that a mapping of such spaces tends to have self-similarity on a 
number of scales. Interestingly, natural systems have also been observed to have self-similarity 
over several orders of magnitude, although as presently believed, not over an unlimited range of 
scales. Therefore, one theory holds that images of natural objects may be efficiently described 
by iterated function systems (IFS), which provide a series of parameters for a generic formula or 
algorithm that, when the process is reversed, yields an image visually similar to the starting 
image. Since the "noise" of the expanded data is masked by the "natural" appearance of the 
result, visually 
acceptable image compression may be provided at relatively high compression ratios 
accompanied by substantial loss of true image information. This theory remains the subject of 
significant debate, and, for example, wavelet algorithm advocates claim superior results for a 
more general set of starting images. It is noted that, on a mathematical level, wavelets and fractal 
constructs are similar or overlapping. 
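
By way of illustration only, the following minimal sketch shows how a small set of IFS parameters may expand into a detailed image. The three affine maps (the Sierpinski triangle), the random-iteration rendering, and the names `SIERPINSKI` and `render_ifs` are assumptions chosen for the example and are not taken from any of the cited references.

```python
import numpy as np

# Three contractive affine maps w(p) = A @ p + b. These 18 numbers are the entire
# "code" for the image (an illustrative choice: the Sierpinski triangle).
HALF = 0.5 * np.eye(2)
SIERPINSKI = [
    (HALF, np.array([0.0, 0.0])),
    (HALF, np.array([0.5, 0.0])),
    (HALF, np.array([0.25, 0.5])),
]

def render_ifs(maps, n_points=20000, seed=0):
    """Random iteration: repeatedly apply a randomly chosen map to a single point."""
    rng = np.random.default_rng(seed)
    point = np.zeros(2)                  # (0, 0) is the fixed point of the first map
    points = np.empty((n_points, 2))
    for i in range(n_points):
        A, b = maps[rng.integers(len(maps))]
        point = A @ point + b
        points[i] = point
    return points

points = render_ifs(SIERPINSKI)
```

Plotting `points` as a scatter of dots shows the same triangular pattern repeated at every scale, which is the self-similarity referred to above.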

U.S. Patent Nos. 5,065,447, and 4,941,193, both incorporated herein by reference, relate 
to the compression of image data by using fractal transforms. These are discussed in detail 
below. U.S. Patent 5,065,447 cites a number of references, relevant to the use of fractals in 
image processing: 

U.S. Patent No. 4,831,659. 

"A New Class of Markov Processes for Image Encoding", School of Mathematics, 
Georgia Inst. of Technology (1988), pp. 14-32. 

"Construction of Fractal Objects with Iterated Function Systems", Siggraph '85 
Proceedings, 19(3):271-278 (1985). 

"Data Compression: Pntng by Numbrs", The Economist, May 21, 1988. 

"Fractal Geometry-Understanding Chaos", Georgia Tech Alumni Magazine, p. 16 (Spring 
1986). 

"Fractal Modelling of Biological Structures", Perspectives in Biological Dynamics and 
Theoretical Medicine, Koslow, Mandell, Shlesinger, eds., Annals of New York Academy of 
Sciences, vol. 504, 179-194 (date unknown). 

"Fractal Modelling of Real World Images, Lecture Notes for Fractals: Introduction, 
Basics and Perspectives", Siggraph (1987). 

"Fractals-A Geometry of Nature", Georgia Institute of Technology Research Horizons, p. 
9 (Spring 1986). 

A. Jacquin, "A Fractal Theory of Iterated Markov Operators with Applications to Digital 
Image Coding", PhD Thesis, Georgia Tech, 1989. 

A. Jacquin, "Image Coding Based on a Fractal Theory of Iterated Contractive Image 
Transformations", p. 18, January 1992 (Vol 1 Issue 1) of IEEE Trans on Image Processing. 

A. Jacquin, 'Fractal image coding based on a theory of iterated contractive image 
transformations', Proc. SPIE Visual Communications and Image Processing, 1990, pages 227-239. 

A.E. Jacquin, 'A novel fractal block-coding technique for digital images', Proc. ICASSP 
1990. 

Baldwin, William, "Just the Bare Facts, Please", Forbes Magazine, Dec. 12, 1988. 

Barnsley et al., "A Better Way to Compress Images", Byte Magazine, Jan. 1988, pp. 213-225. 

Barnsley et al., "Chaotic Compression", Computer Graphics World, Nov. 1987. 

Barnsley et al., "Harnessing Chaos For Images Synthesis", Computer Graphics, 
22(4):131-140 (August, 1988). 

Barnsley et al., "Hidden Variable Fractal Interpolation Functions", School of 
Mathematics, Georgia Institute of Technology, Atlanta, GA. 30332, Jul., 1986. 

Barnsley, M.F., "Fractals Everywhere", Academic Press, Boston, MA, 1988. 

Barnsley, M.F., and Demko, S., "Iterated Function Systems and The Global Construction 
of Fractals", Proc. R. Soc. Lond., A399:243-275 (1985). 

Barnsley, M.F., Ervin, V., Hardin, D., Lancaster, J., "Solution of an Inverse Problem for 
Fractals and Other Sets", Proc. Natl. Acad. Sci. U.S.A., 83:1975-1977 (Apr. 1986). 

Beaumont J M, "Image data compression using fractal techniques", British Telecom 
Technological Journal 9(4):93-108 (1991). 

Byte Magazine, Jan. 1988, supra, cites: 

D.S. Mazel, Fractal Modeling of Time-Series Data, PhD Thesis, Georgia Tech, 1991. 
(One dimensional, not pictures). 

Derra, Skip, "Researchers Use Fractal Geometry, .", Research and Development 
Magazine, Mar. 1988. 

Elton, J., "An Ergodic Theorem for Iterated Maps", Journal of Ergodic Theory and 
Dynamical Systems, 7 (1987). 

Fisher Y, "Fractal image compression", Siggraph 92. 

Fractal Image Compression, Michael F. Barnsley and Lyman P. Hurd, ISBN 0-86720-457-5, 
ca. 250 pp. 

Fractal Image Compression: Theory and Application, Yuval Fisher (ed.), Springer 
Verlag, New York, 1995. ISBN number 0-387-94211-4. 

Fractal Modelling of Biological Structures, School of Mathematics, Georgia Institute of 
Technology (date unknown). 

G.E. Oien, S. Lepsoy & T.A. Ramstad, 'An inner product space approach to image coding 
by contractive transformations', Proc. ICASSP 1991, pp 2773-2776. 

Gleick, James, "Making a New Science", pp. 215, 239, date unknown. 

Graf S, "Barnsley's Scheme for the Fractal Encoding of Images", Journal Of 
Complexity, V8, 72-78 (1992). 

Jacobs, E.W., Y. Fisher and R.D. Boss, "Image Compression: A study of the Iterated 
Transform Method", Signal Processing 29, (1992) 25-263. 

M. Barnsley, L. Anson, "Graphics Compression Technology", SunWorld, October 1991, 
pp. 42-52. 

M.F. Barnsley, A. Jacquin, F. Malassenet, L. Reuter & A.D. Sloan, 'Harnessing chaos for 
image synthesis', Computer Graphics, vol 22 no 4 pp 131-140, 1988. 

M.F. Barnsley, A.E. Jacquin, 'Application of recurrent iterated function systems to 
images', Visual Comm. and Image Processing, vol SPIE-1001, 1988. 

Mandelbrot, B., "The Fractal Geometry of Nature", W.H. Freeman & Co., San Francisco, 
CA, 1982, 1977. 

Monro D M and Dudbridge F, "Fractal block coding of images", Electronics Letters 
28(11):1053-1054 (1992). 

Monro D.M. & Dudbridge F. 'Fractal approximation of image blocks', Proc ICASSP 92, 
pp. III: 485-488. 

Monro D.M. 'A hybrid fractal transform', Proc ICASSP 93, pp. V: 169-72. 

Monro D.M., Wilson D., Nicholls J. A. 'High speed image coding with the Bath Fractal 
Transform', IEEE International Symposium on Multimedia Technologies Southampton, April 
1993. 

Peterson, Ivars, "Packing It In-Fractals ..", Science News, 131(18):283-285 (May 2, 
1987). 

S. A. Hollatz, "Digital image compression with two-dimensional affine fractal 
interpolation functions", Department of Mathematics and Statistics, University of 
Minnesota-Duluth, Technical Report 91-2. (a nuts-and-bolts how-to-do-it paper on the technique). 

Stark, J., "Iterated function systems as neural networks", Neural Networks, Vol 4. pp 
679-690, Pergamon Press, 1991. 

Vrscay, Edward R. "Iterated Function Systems: Theory, Applications, and the Inverse 
Problem", Fractal Geometry and Analysis, J. Belair and S. Dubuc (eds.), Kluwer Academic, 1991, 
pp. 405-468. 

U.S. Patent No. 5,347,600, incorporated herein by reference, relates to a method and 
apparatus for compression and decompression of digital image data, using fractal methods. 
According to this method, digital image data is automatically processed by dividing stored image 
data into domain blocks and range blocks. The range blocks are subjected to processes such as a 
shrinking process to obtain mapped range blocks. The range blocks or domain blocks may also 
be processed by processes such as affine transforms. Then, for each domain block, the mapped 
range block which is most similar to the domain block is determined, and the address of that 
range block and the processes the blocks were subjected to are combined as an identifier which is 
appended to a list of identifiers for other domain blocks. The list of identifiers for all domain 
blocks is called a fractal transform and constitutes a compressed representation of the input 
image. To decompress the fractal transform and recover the input image, an arbitrary input 
image is formed into range blocks and the range blocks processed in a manner specified by the 
identifiers to form a representation of the original input image. 
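
By way of illustration only, the block-matching procedure just described may be sketched as follows. The sketch is not the implementation of U.S. Patent No. 5,347,600: the block sizes, the use of a least-squares intensity scale and offset as the "processes", the clipping of the scale below one, the repeated application of the identifiers during decompression, and all function names are assumptions made for the example.

```python
import numpy as np

def shrink(block):
    """Shrinking process: halve each side by averaging 2x2 pixel groups."""
    h, w = block.shape
    return block.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def encode(image, d=4, s_max=0.8):
    """One identifier per domain block: (domain corner, range address, scale, offset)."""
    n = image.shape[0]                   # assumes a square image with side divisible by d
    addresses = [(r, c) for r in range(0, n - 2 * d + 1, d)
                 for c in range(0, n - 2 * d + 1, d)]
    mapped = [shrink(image[r:r + 2 * d, c:c + 2 * d]) for r, c in addresses]
    transform = []
    for r0 in range(0, n, d):
        for c0 in range(0, n, d):
            dom = image[r0:r0 + d, c0:c0 + d]
            best = None
            for address, m in zip(addresses, mapped):
                var = m.var()
                # Least-squares intensity scale, clipped so that decoding is contractive.
                s = 0.0 if var == 0 else float(((m - m.mean()) * (dom - dom.mean())).mean() / var)
                s = min(max(s, -s_max), s_max)
                o = dom.mean() - s * m.mean()
                err = float(((s * m + o - dom) ** 2).sum())
                if best is None or err < best[0]:
                    best = (err, address, s, o)
            transform.append(((r0, c0), best[1], best[2], best[3]))
    return transform

def decode(transform, start, d=4, iterations=40):
    """Apply the identifiers repeatedly, beginning from an arbitrary image."""
    current = start.astype(float)
    for _ in range(iterations):
        nxt = np.empty_like(current)
        for (r0, c0), (r, c), s, o in transform:
            nxt[r0:r0 + d, c0:c0 + d] = s * shrink(current[r:r + 2 * d, c:c + 2 * d]) + o
        current = nxt
    return current

image = np.add.outer(8.0 * np.arange(16), 4.0 * np.arange(16))    # smooth test ramp
transform = encode(image)
from_zeros = decode(transform, np.zeros_like(image))
from_noise = decode(transform, np.random.default_rng(0).uniform(0, 255, image.shape))
```

Because every identifier shrinks the intensity differences, decoding from a blank image and from random noise arrives at the same result, so the starting image need not be transmitted.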



"Image Compression Using Fractals and Wavelets", Final Report for the Phase II 
Contract Sponsored by the Office of Naval Research, Contract No. N00014-91-C-0117, 
Netrologic Inc., San Diego, California (June 2, 1993), relates to various methods of compressing 
image data, including fractals and wavelets. This method may also be applicable in pattern 
recognition applications. This reference provides theory and comparative analysis of 
compression schemes. 

An image extraction method based on fractal processing is described in Kim, D.H.; 
Caulfield, H.J.; Jannson, T.; Kostrzewski, A.; Savant, G, "Optical fractal image processor for 
noise-embedded targets detection", Proceedings of the SPIE - The International Society for 
Optical Engineering, Vol. 2026, p. 144-9 (1993) (SPIE Conf: Photonics for Processors, Neural 
Networks, and Memories 12-15 July 1993, San Diego, CA, USA). This paper describes 
automatic target recognition (ATR) based on fractal dimensionality measurement and analysis. 
The ATR is a multi-step procedure, based on fractal image processing, and can 
simultaneously perform preprocessing, interest locating, segmenting, feature extracting, and 
classifying. See also, Cheong, C.K.; Aizawa, K.; Saito, T.; Hatori, M., "Adaptive edge detection 
with fractal dimension", Transactions of the Institute of Electronics, Information and 
Communication Engineers D-II, J76D-II(11):2459-63 (1993); Hayes, H.I.; Solka, J.L.; Priebe, 
C.E.; "Parallel computation of fractal dimension", Proceedings of the SPIE - The International 
Society for Optical Engineering, 1962:219-30 (1993); Priebe, C.E.; Solka, J.L.; Rogers, G.W., 
"Discriminant analysis in aerial images using fractal based features", Proceedings of the SPIE - 
The International Society for Optical Engineering, 1962:196-208 (1993). See also, Anson, L., 
"Fractal Image Compression", Byte, October 1993, pp. 195-202; "Fractal Compression Goes 
On-Line", Byte, September 1993. 

Methods employing other than fractal-based algorithms may also be used. See, e.g., Liu, 
Y., "Pattern recognition using Hilbert space", Proceedings of the SPIE - The International 
Society for Optical Engineering, 1825:63-77 (1992), which describes a learning approach, the 
Hilbert learning. This approach is similar to Fractal learning, but the Fractal part is replaced by 
Hilbert space. Like the Fractal learning, the first stage is to encode an image to a small vector in 
the internal space of a learning system. The next stage is to quantize the internal parameter 
space. The internal space of a Hilbert learning system is defined as follows: a pattern can be 
interpreted as a representation of a vector in a Hilbert space. Any vectors in a Hilbert space can 
be expanded. If a vector happens to be in a subspace of a Hilbert space where the dimension L of 
the subspace is low (order of 10), the vector can be specified by its norm, an L-vector, and the 
Hermitian operator which spans the Hilbert space, establishing a mapping from an image space 
to the internal space P. This mapping converts an input image to a 4-tuple: t in P=(Norm, T, N, 
L-vector), where T is an operator parameter space and N is a set of integers which specifies the 
boundary condition. The encoding is implemented by mapping an input pattern into a point in its 
internal space. The system uses a local search algorithm, i.e., the system adjusts its internal data 
locally. The search is first conducted for an operator in a parameter space of operators, then an 
error function delta (t) is computed. The algorithm stops at a local minimum of delta (t). Finally, 
the input training set divides the internal space by a quantization procedure. See also, Liu, Y., 
"Extensions of fractal theory", Proceedings of the SPIE - The International Society for Optical 
Engineering, 1966:255-68 (1993). 

Fractal methods may be used for pattern recognition. See, Sadjadi, F., "Experiments in 
the use of fractal in computer pattern recognition", Proceedings of the SPIE - The International 
Society for Optical Engineering, 1960:214-22 (1993). According to this reference, man-made 
objects in infrared and millimeter wave (MMW) radar imagery may be recognized using fractal- 
based methods. The technique is based on estimation of the fractal dimensions of sequential 
blocks of an image of a scene and slicing of the histogram of the fractal dimensions computed by 
Fourier regression. The technique is shown to be effective for the detection of tactical military 
vehicles in IR, and of airport attributes in MMW radar imagery. 
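
By way of illustration only, a block-wise fractal dimension estimate of the general kind referred to above may be sketched as follows. The sketch assumes a fractional-Brownian surface model, under which a two-dimensional power spectrum falling off as f^(-beta) corresponds to a fractal dimension D = (8 - beta)/2. That model, the synthetic test surfaces, the threshold of 2.5, and all function names are assumptions of the example and are not taken from the cited reference.

```python
import numpy as np

def radial_frequency(n):
    """Magnitude of the spatial frequency at each bin of an n x n FFT."""
    fy, fx = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n), indexing="ij")
    return np.hypot(fx, fy)

def fourier_fractal_dimension(block):
    """Regress log power on log frequency; convert the slope to D = (8 - beta) / 2."""
    power = np.abs(np.fft.fft2(block - block.mean())) ** 2
    freq = radial_frequency(block.shape[0])
    keep = (freq > 0) & (power > 0)
    slope, _ = np.polyfit(np.log(freq[keep]), np.log(power[keep]), 1)
    return (8.0 + slope) / 2.0           # beta = -slope

def dimension_map(image, size):
    """Fractal dimension of each size x size block, scanned row by row."""
    rows, cols = image.shape[0] // size, image.shape[1] // size
    return np.array([[fourier_fractal_dimension(
        image[r * size:(r + 1) * size, c * size:(c + 1) * size])
        for c in range(cols)] for r in range(rows)])

def synthetic_surface(n, hurst, rng):
    """Test data only: white noise shaped to a power spectrum of f^-(2*hurst + 2)."""
    freq = radial_frequency(n)
    freq[0, 0] = 1.0                     # avoid dividing by zero at the DC bin
    shaped = np.fft.fft2(rng.standard_normal((n, n))) * freq ** -(hurst + 1.0)
    return np.real(np.fft.ifft2(shaped))

rng = np.random.default_rng(0)
scene = np.hstack([synthetic_surface(32, 0.2, rng), synthetic_surface(32, 0.8, rng)])
dims = dimension_map(scene, 32)
flagged = dims < 2.5                     # one "slice" of the histogram of block dimensions
```

Here the rough left-hand block yields a higher estimated dimension than the smooth right-hand block, so slicing the histogram of block dimensions at a threshold separates the two regions.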

In addition to spatial self-similarity, temporal self-similarity may also be analyzed using 
fractal methods. See, Reusens, E., "Sequence coding based on the fractal theory of iterated 
transformations systems", Proceedings of the SPIE - The International Society for Optical 
Engineering, 2094(pt.1):132-40 (1993). This reference describes a scheme based on the iterated 
function systems theory that relies on a 3D approach in which the sequence is adaptively 
partitioned. Each partition block can be coded either by using the spatial self-similarities or by 
exploiting temporal redundancies. Audio and radar data are typically susceptible to such 
analysis to produce particularly useful results, due to the existence of echoes and relatively 
transfer functions (including resonant features). 

Fractal compression methods may be used for video data for transmission. See, Hurtgen, 
B.; Buttgen, P., "Fractal approach to low rate video coding", Proceedings of the SPIE - The 
International Society for Optical Engineering, 2094(pt.1):120-31 (1993). This reference relates to 
a method for fast encoding and decoding of image sequences on the basis of fractal coding theory 
and the hybrid coding concept. The DPCM-loop accounts for statistical dependencies of natural 
image sequences in the temporal direction. Those regions of the original image where the 
prediction, i.e. motion estimation and compensation, fails are encoded using an advanced fractal 
coding scheme, suitable for still images, and whose introduction instead of the commonly used 
Discrete Cosine Transform (DCT)-based coding is advantageous especially at very low bit rates 
(8-64 kbit/s). In order to increase reconstruction quality, encoding speed and compression ratio, 
some additional features such as hierarchical codebook search and multilevel block segmentation 
may be employed. This hybrid technique may be used in conjunction with the present adaptive 
interface or other features of the present invention. 

Fractal methods may be used to segment an image into objects having various surface 
textures. See, Zhi-Yan Xie; Brady, M., "Fractal dimension image for texture segmentation", 
ICARCV '92. Second International Conference on Automation, Robotics and Computer Vision, 
p. CV-4.3/1-5 vol.1, (1992). According to this reference, the fractal dimension and its change 
over boundaries of different homogeneous textured regions is analyzed and used to segment 
textures in infrared aerial images. Based on the fractal dimension, different textures map into 
different fractal dimension image features, such that there is smooth variation within a single 
homogeneous texture but sharp variation at texture boundaries. Since the fractal dimension 
remains unchanged under linear transformation, this method is robust for dismissing effects 
caused by lighting and other extrinsic factors. Morphology is the only tool used in the 
implementation of the whole process: texture feature extraction, texture segmentation and 
boundary detection. This makes possible parallel implementations of each stage of the process. 

Rahmati, M.; Hassebrook, L.G., "Intensity- and distortion-invariant pattern recognition 
with complex linear morphology", Pattern Recognition, 27(4):549-68 (1994), relates to a unified 
model-based pattern recognition approach which can be formulated into a variety of techniques 
to be used for a variety of applications. In this approach, complex phasor addition and 
cancellation are incorporated into the design of filter(s) to perform implicit logical operations 
using linear correlation operators. These implicit logical operations are suitable to implement 
high level gray scale morphological transformations of input images. In this way non-linear 
decision boundaries are effectively projected into the input signal space yet the mathematical 
simplicity of linear filter designs is maintained. This approach is applied to the automatic 
distortion- and intensity-invariant object recognition problem. A set of shape operators or 
complex filters is introduced which are logically structured into a filter bank architecture to 
accomplish the distortion and intensity-invariant system. This synthesized complex filter bank is 
optimally sensitive to fractal noise representing natural scenery. The sensitivity is optimized for 
a specific fractal parameter range using the Fisher discriminant. The output responses of the 
proposed system are shown for target, clutter, and pseudo-target inputs to represent its 
discrimination and generalization capability in the presence of distortion and intensity variations. 
Its performance is demonstrated with realistic scenery as well as synthesized inputs. 

Sprinzak, J.; Werman, M, "Affine point matching", Pattern Recognition Letters, 
15(4):337-9 (1994), relates to a pattern recognition method. A fundamental problem of pattern 
recognition, in general, is recognizing and locating objects within a given scene. The image of 
an object may have been distorted by different geometric transformations such as translation, 
rotation, scaling, general affine transformation or perspective projection. The recognition task 
involves finding a transformation that superimposes the model on its instance in the image. This 
reference proposes an improved method of superimposing the model. 
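
By way of illustration only, the step of finding a transformation that superimposes a model on its instance may be sketched as a least-squares affine fit. The sketch assumes that the point correspondences are already known, which is a simplification of the matching problem addressed by the cited reference. The function names `fit_affine` and `apply_affine` are hypothetical.

```python
import numpy as np

def fit_affine(model_pts, image_pts):
    """Least-squares A (2x2) and t (2,) such that image_pts ~= model_pts @ A.T + t."""
    ones = np.ones((len(model_pts), 1))
    design = np.hstack([model_pts, ones])                  # one row [x, y, 1] per point
    params, *_ = np.linalg.lstsq(design, image_pts, rcond=None)
    return params[:2].T, params[2]

def apply_affine(A, t, pts):
    """Map each row of pts through p -> A @ p + t."""
    return pts @ A.T + t

# A model outline and its instance after rotation, scaling, shear and translation.
model = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.2]])
theta = np.deg2rad(30.0)
true_A = 1.5 * np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta),  np.cos(theta)]]) @ np.array([[1.0, 0.3],
                                                                       [0.0, 1.0]])
true_t = np.array([2.0, -1.0])
instance = apply_affine(true_A, true_t, model)

A, t = fit_affine(model, instance)
```

At least three non-collinear correspondences are needed for the six affine parameters to be determined. Additional points are averaged in the least-squares sense.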

TEMPORAL IMAGE ANALYSIS 

Temporal image analysis is a well-known field. This field holds substantial interest at 
present for two reasons. First, by temporal analysis of a series of two-dimensional images, 
objects and object planes (including motion planes) may be defined, which provide a basis for 
efficient yet general algorithms for video compression, such as the Motion Picture Experts Group 
(MPEG) series of standards. Second, temporal analysis has applications in signal analysis for an 
understanding and analysis of the signal itself. 

U.S. Patent No. 5,280,530, incorporated herein by reference, relates to a method and 
apparatus for tracking a moving object in a scene, for example the face of a person in videophone 
applications. The method comprises forming an initial template of the face, extracting a mask 
outlining the face, dividing the template into a plurality (for example sixteen) of sub-templates, 
searching the next frame to find a match with the template, searching the next frame to find a 
match with each of the sub-templates, determining the displacements of each of the sub-templates 
with respect to the template, using the displacements to determine affine transform coefficients, 
and performing an affine transform to produce an updated template and updated mask. 
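
By way of illustration only, the searching steps of such a tracker may be sketched as an exhaustive sum-of-squared-differences search. The sketch is not the method of U.S. Patent No. 5,280,530: the search radius, the 4 x 4 grid of sub-templates, the error measure, and the function names are assumptions made for the example.

```python
import numpy as np

def find_displacement(template, top_left, next_frame, radius=4):
    """Best (drow, dcol) for the template within +/- radius pixels of top_left."""
    h, w = template.shape
    r0, c0 = top_left
    best, best_err = (0, 0), np.inf
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            r, c = r0 + dr, c0 + dc
            if r < 0 or c < 0 or r + h > next_frame.shape[0] or c + w > next_frame.shape[1]:
                continue                  # candidate window falls outside the frame
            err = np.sum((next_frame[r:r + h, c:c + w] - template) ** 2)
            if err < best_err:
                best, best_err = (dr, dc), err
    return best

def sub_template_displacements(frame, next_frame, top_left, size, grid=4, radius=4):
    """Displacement of each of grid x grid sub-templates of a size x size template."""
    step = size // grid
    r0, c0 = top_left
    result = {}
    for i in range(grid):
        for j in range(grid):
            corner = (r0 + i * step, c0 + j * step)
            sub = frame[corner[0]:corner[0] + step, corner[1]:corner[1] + step]
            result[(i, j)] = find_displacement(sub, corner, next_frame, radius)
    return result

rng = np.random.default_rng(0)
frame = rng.random((64, 64))
next_frame = np.roll(frame, shift=(2, -3), axis=(0, 1))    # whole scene moves by (2, -3)
whole = find_displacement(frame[20:36, 20:36], (20, 20), next_frame)
parts = sub_template_displacements(frame, next_frame, (20, 20), size=16)
```

For this pure translation every sub-template reports the same displacement as the whole template. Under rotation or scaling the sub-template displacements differ from one another, and those differences are what allow affine coefficients to be estimated.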

U.S. Patent No. 5,214,504 relates to a moving video image estimation system in which, based 
on original video images at time n and time n+1, the centroid, the principal axis of inertia, the 
moment about the principal axis of inertia and the moment about the axis perpendicular to the 
principal axis of inertia are obtained. By using this information, an affine transformation for 
transforming the original video image at time n to the original video image at time n+1 is 
obtained. Based on the infinitesimal transformation (A), {e^{At} and e^{A(t-1)}}, obtained by 
making the affine transformation continuous with regard to time, are executed on the original 
video images at time n and time n+1, respectively. The results are synthesized to perform an 
interpolation between the frames. {e^{A(t-1)}} is applied to the original video image at time 
n+1. The video image after time n+1 is thereby predicted. 
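
By way of illustration only, the centroid and inertia quantities named above may be computed from intensity-weighted image moments as sketched below. The sketch takes the principal axis to be the axis of greatest elongation, which is an assumption, and the function name `inertia_features` is hypothetical. The derivation of the affine transformation from these quantities is not shown.

```python
import numpy as np

def inertia_features(image):
    """Centroid, principal-axis angle, and moments about that axis and its perpendicular."""
    rows, cols = np.indices(image.shape)
    mass = image.sum()
    cy, cx = (rows * image).sum() / mass, (cols * image).sum() / mass
    dy, dx = rows - cy, cols - cx
    # Second central moments, normalised by the total intensity.
    mxx = (image * dx * dx).sum() / mass
    myy = (image * dy * dy).sum() / mass
    mxy = (image * dx * dy).sum() / mass
    evals, evecs = np.linalg.eigh(np.array([[mxx, mxy], [mxy, myy]]))
    major = evecs[:, 1]                   # direction of greatest spread (elongation)
    angle = np.arctan2(major[1], major[0]) % np.pi
    about_axis = evals[0]                 # spread measured perpendicular to the axis
    about_perpendicular = evals[1]        # spread measured along the axis
    return (cx, cy), angle, (about_axis, about_perpendicular)

# A horizontal bar: 4 pixels tall, 24 pixels wide.
bar = np.zeros((32, 32))
bar[14:18, 4:28] = 1.0
centroid, angle, (about_axis, about_perpendicular) = inertia_features(bar)
```

The angle is measured from the column (x) axis towards increasing row index and is reported modulo pi, since an axis has no preferred direction.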

U.S. Patent No. 5,063,603, incorporated herein by reference, relates to a dynamic method 
for recognizing objects and image processing system therefor. This reference discloses a method 
of distinguishing between different members of a class of images, such as human beings. A time 
series of successive relatively high-resolution frames of image data, any frame of which may or 
may not include a graphical representation of one or more predetermined specific members (e.g., 
particular known persons) of a given generic class (e.g. human beings), is examined in order to 
recognize the identity of a specific member, if that member's image is included in the time series. 
The frames of image data may be examined in real time at various resolutions, starting with a 
relatively low resolution, to detect whether some earlier-occurring frame includes any of a group 
of image features possessed by an image of a member of the given class. The image location of a 
detected image feature is stored and then used in a later-occurring, higher resolution frame to 
direct the examination only to the image region of the stored location in order to (1) verify the 
detection of the aforesaid image feature, and (2) detect one or more other of the group of image 
features, if any is present in that image region of the frame being examined. By repeating this 
type of examination for later and later occurring frames, the accumulated detected features can 
first reliably recognize the detected image region to be an image of a generic object of the given 
class, and later can reliably recognize the detected image region to be an image of a certain 
specific member of the given class. Thus, a human identity recognition feature of the present 
invention may be implemented in this manner. Further, it is clear that this recognition feature 
may form an integral part of certain embodiments of the present invention. It is also clear that 
the various features of the present invention would be applicable as an adjunct to the various 
elements of the system disclosed in U.S. Patent 5,063,603. 

U.S. Patent No. 5,067,160, incorporated herein by reference, relates to a motion-pattern 
recognition apparatus, having adaptive capabilities. The apparatus recognizes a motion of an 
object that is moving and is hidden in an image signal, and discriminates the object from the 
background within the signal. The apparatus has an image-forming unit comprising non-linear 
oscillators, which forms an image of the motion of the object in accordance with an adjacent- 
mutual-interference-rule, on the basis of the image signal. A memory unit, comprising 
non-linear oscillators, stores conceptualized meanings of several motions. A retrieval unit 
retrieves a conceptualized meaning close to the motion image of the object. An altering unit 
alters the rule, on the basis of the conceptualized meaning. The image forming unit, memory 
unit, retrieval unit and altering unit form a holonic-loop. Successive alterations of the rules by 
the altering unit within the holonic loop change an ambiguous image formed in the image 
forming unit into a distinct image. U.S. Patent 5,067,160 cites the following references, which 
are relevant to the task of discriminating a moving object in a background: 

U.S. Patent No. 4,710,964. 

Shimizu et al, "Principle of Holonic Computer and Holovision", Journal of the Institute of 
Electronics, Information and Communication, 70(9):921-930 (1987). 

Omata et al, "Holonic Model of Motion Perception", IEICE Technical Reports, 3/26/88, 
pp. 339-346. 

Ohsuga et al, "Entrainment of Two Coupled van der Pol Oscillators by an External 
Oscillation", Biological Cybernetics, 51:225-239 (1985). 

U.S. Patent No. 5,065,440, incorporated herein by reference, relates to a pattern 
recognition apparatus, which compensates for, and is thus insensitive to pattern shifting, thus 
being useful for decomposing an image or sequence of images, into various structural features 
and recognizing the features. U.S. Patent 5,065,440 cites the following references, incorporated 
herein by reference, which are also relevant to the present invention: U.S. Patent Nos. 4,543,660, 
4,630,308, 4,677,680, 4,809,341, 4,864,629, 4,872,024 and 4,905,296. 

Recent analyses of fractal image compression techniques have tended to imply that, other 
than in special circumstances, other image compression methods are "better" than a Barnsley- 
type image compression system, due to the poor performance of compression processors and 
lower than expected compression ratios. Further, statements attributed to Barnsley have 
indicated that the Barnsley technique is not truly a "fractal" technique, but rather a vector 
quantization process that employs a recursive library. Nevertheless, these techniques and 
analyses have their advantages. As stated hereinbelow, the fact that the codes representing the 
compressed image are hierarchical represents a particular facet exploited by the present 
invention. 

Another factor which makes fractal methods and analysis relevant to the present 
invention is the theoretical relation to optical image processing and holography. Thus, while 
such optical systems may presently be cumbersome and economically unfeasible, and their 
implementation in software models slow, these techniques nevertheless hold promise and present 
distinct advantages. 

BIOMETRIC ANALYSIS 

Biometric analysis comprises the study of the differences between various organisms, 
typically of the same species. Thus, the intraspecies variations become the basis for 
differentiation and identification. In practice, there are many applications for biometric analysis 
systems; in security applications, for example, they allow identification of a particular human. 

U.S. Patent No. 5,055,658, incorporated herein by reference, relates to a security system 
employing digitized personal characteristics, such as voice. The following references are cited: 

"Voice Recognition and Speech Processing", Elektor Electronics, Sep. 1985, pp. 56-57. 

Naik et al., "High Performance Speaker Verification .", ICASSP 86, Tokyo, 
CH2243-4/86/0000-0881, IEEE 1986, pp. 881-884. 

Shinan et al., "The Effects of Voice Disguise .", ICASSP 86, Tokyo, 
CH2243-4/86/0000-0885, IEEE 1986, pp. 885-888. 

Parts of this system relating to speaker recognition may be used to implement a voice 
recognition system of the present invention for determining an actor or performer in a broadcast. 



NEURAL NETWORKS 

Neural networks are a particular type of data analysis tool. They are characterized by the 
fact that the network is represented by a set of "weights", which are typically scalar values, 
which are derived by a formula which is designed to reduce the error between a data pattern 
representing a known state and the network's prediction of that state. These networks, when 
provided with sufficient complexity and an appropriate training set, may be quite sensitive and 
precise. Further, the data pattern may be arbitrarily complex (although the computing power 
required to evaluate the output will also grow) and therefore these systems may be employed for 
video and other complex pattern analysis. 
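As an illustration of weights being derived to reduce the error between a known state and the network's prediction, the following is a minimal sketch only. It assumes a single sigmoid unit trained by gradient descent on squared error; the function names, the learning rate, the epoch count and the logical-OR training set are all hypothetical choices made for the example and are not taken from any cited reference.

```python
import math


def _activate(weights, inputs):
    """Sigmoid of the weighted sum; the last weight acts as a bias."""
    x = list(inputs) + [1.0]
    total = sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-total))


def train_neuron(samples, epochs=2000, rate=0.5):
    """Derive scalar weights that reduce squared error on the training set."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * (n_inputs + 1)
    for _ in range(epochs):
        for inputs, target in samples:
            out = _activate(weights, inputs)
            # Gradient of 0.5 * (out - target)**2 with respect to the weighted sum.
            delta = (out - target) * out * (1.0 - out)
            x = list(inputs) + [1.0]
            weights = [w - rate * delta * v for w, v in zip(weights, x)]
    return weights


def predict(weights, inputs):
    """Network's prediction of the state for one input pattern."""
    return _activate(weights, inputs)


# Hypothetical training set: data patterns paired with their known states (logical OR).
OR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
```

Calling `train_neuron(OR_SAMPLES)` returns three scalar weights; `predict` applied to those weights then yields values below 0.5 for the pattern (0, 0) and above 0.5 for the other three patterns. Practical networks of the kind cited use many such units and more elaborate training procedures.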

U.S. Patent No. 5,067,164, incorporated herein by reference, relates to a hierarchical 
constrained automatic learning neural network for character recognition, and thus represents an 
example of a trainable neural network for pattern recognition, which discloses methods which are 
useful for the present invention. This Patent cites various references of interest: 

U.S. Patent Nos. 4,760,604, 4,774,677 and 4,897,811. 

LeCun, Y., Connectionism in Perspective, R. Pfeifer, Z. Schreter, F. Fogelman, L. Steels, 
(Eds.), 1989, "Generalization and Network Design Strategies", pp. 143-55. 

LeCun, Y., et al., "Handwritten Digit Recognition: Applications of Neural.", IEEE 
Comm. Magazine, pp. 41-46 (Nov. 1989). 

Lippmann, R. P., "An Introduction to Computing with Neural Nets", IEEE ASSP 
Magazine, 4(2):4-22 (Apr. 1987). 

Rumelhart, D. E., et al., Parallel Distr. Proc: Explorations in Microstructure of 
Cognition, vol. 1, 1986, "Learning Internal Representations by Error Propagation", pp. 318-362. 

U.S. Patents 5,048,100, 5,063,601 and 5,060,278, all incorporated herein by reference, 
also relate to neural network adaptive pattern recognition methods and apparatuses. It is clear 
that the methods of 5,048,100, 5,060,278 and 5,063,601 may be used to perform the adaptive 
pattern recognition functions of the present invention. More general neural networks are 
disclosed in U.S. Patents 5,040,134 and 5,058,184, both incorporated herein by reference, which 
provide background on the use of neural networks. In particular, 5,058,184 relates to the use of 
the apparatus in information processing and feature detection applications. 

U.S. Patent No. 5,058,180, incorporated herein by reference, relates to a neural network 
apparatus and method for pattern recognition, and is thus relevant to the intelligent pattern 
recognition functions of the present invention. This patent cites the following documents of 
interest: 

U.S. Patent Nos. 4,876,731 and 4,914,708. 

Carpenter, G. A., S. Grossberg, "The Art of Adaptive Pattern Recognition by a 
Self-Organizing Neural Network," IEEE Computer, Mar. 1988, pp. 77-88. 

Computer Visions, Graphics, and Image Processing 1987, 37:54-115. 

Grossberg, S., G. Carpenter, "A Massively Parallel Architecture for a Self-Organizing 
Neural Pattern Recognition Machine," Computer Vision, Graphics, and Image Processing (1987, 
37, 54-115), pp. 252-315. 

Gullichsen E., E. Chang, "Pattern Classification by Neural Network: An Experiment 
System for Icon Recognition," ICNN Proceeding on Neural Networks, Mar. 1987, pp. 
IV-725-32. 

Jackel, L. D., H. P. Graf, J. S. Denker, D. Henderson and I. Guyon, "An Application of 
Neural Net Chips: Handwritten Digit Recognition," ICNN Proceeding, 1988, pp. 11-107-15. 

Lippman, R. P., "An Introduction to Computing with Neural Nets," IEEE ASSP 
Magazine, Apr. 1987, pp. 4-22. 

Pawlicki, T. F., D. S. Lee, J. J. Hull and S. N. Srihari, "Neural Network Models and their 
Application to Handwritten Digit Recognition," ICNN Proceeding, 1988, pp. 11-63-70. 

Chao, T.-H.; Hegblom, E.; Lau, B.; Stoner, W.W.; Miceli, W.J., "Optoelectronically 
implemented neural network with a wavelet preprocessor", Proceedings of the SPIE - The 
International Society for Optical Engineering, 2026:472-82 (1993), relates to an optoelectronic 
neural network based upon the Neocognitron paradigm which has been implemented and 
successfully demonstrated for automatic target recognition for both focal plane array imageries 
and range-Doppler radar signatures. A particular feature of this neural network architectural 
design is the use of a shift-invariant multichannel Fourier optical correlation as a building block 
for iterative multilayer processing. A bipolar neural weights holographic synthesis technique 
was utilized to implement both the excitatory and inhibitory neural functions and increase its 
discrimination capability. In order to further increase the optoelectronic Neocognitron's 
self-organization processing ability, a wavelet preprocessor was employed for feature extraction 
preprocessing (orientation, size, location, etc.). A multichannel optoelectronic wavelet processor 
using an e-beam complex-valued wavelet filter is also described. 

Neural networks are important tools for extracting patterns from complex input sets. 
These systems do not require human comprehension of the pattern in order to be useful, although 
human understanding of the nature of the problem is helpful in designing the neural network 
system, as is known in the art. Feedback to the neural network is integral to the training process. 
Thus, a set of inputs is mapped to a desired output range, with the network minimizing an 
"error" for the training data set. Neural networks may differ based on the computation of the 
"error", the optimization process, the method of altering the network to minimize the error, and 
the internal topology. Such factors are known in the art. 

OPTICAL PATTERN RECOGNITION 

Optical image processing holds a number of advantages. First, images are typically 
optical by their nature, and therefore processing by this means may (but not always) avoid a data 
conversion. Second, many optical image processing schemes are inherently or easily performed 
in parallel, improving throughput. Third, optical circuits typically have response times shorter 
than electronic circuits, allowing potentially short cycle times. While many optical phenomena 
may be modeled using electronic computers, appropriate applications for optical computing, such 
as pattern recognition, hold promise for high speed in systems of acceptable complexity. 

U.S. Patent No. 5,060,282, incorporated herein by reference, relates to an optical pattern 
recognition architecture implementing the mean-square error correlation algorithm. This method 
allows an optical computing function to perform pattern recognition functions. U.S. Patent No. 
5,060,282 cites the following references, which are relevant to optical pattern recognition: 

Kellman, P., "Time Integrating Optical Signal Processing", Ph. D. Dissertation, Stanford 
University, 1979, pp. 51-55. 

Molley, P., "Implementing the Difference-Squared Error Algorithm Using An 
Acousto-Optic Processor", SPIE, 1098:232-239, (1989). 

Molley, P., et al., "A High Dynamic Range Acousto-Optic Image Correlator for 
Real-Time Pattern Recognition", SPIE, 938:55-65 (1988). 



Psaltis, D., "Incoherent Electro-Optic Image Correlator", Optical Engineering, 
23(1):12-15 (Jan./Feb. 1984). 

Psaltis, D., "Two-Dimensional Optical Processing Using One-Dimensional Input 
Devices", Proceedings of the IEEE, 72(7):962-974 (Jul. 1984). 

Rhodes, W., "Acousto-Optic Signal Processing: Convolution and Correlation", Proc. of 
the IEEE, 69(1):65-79 (Jan. 1981). 

Vander Lugt, A., "Signal Detection By Complex Spatial Filtering", IEEE Transactions 
On Information Theory, IT-10, 2:139-145 (Apr. 1964). 
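The mean-square error correlation that U.S. Patent No. 5,060,282 is said to implement optically can be illustrated in software. The following is a minimal digital sketch only, assuming small grayscale images held as nested lists; it does not represent the optical architecture of the patent, and the function name and sample data are hypothetical.

```python
def mse_correlate(image, template):
    """Slide the template over the image and return (row, col, error) of the
    window with the lowest mean-square error against the template."""
    t_rows, t_cols = len(template), len(template[0])
    best = None
    for r in range(len(image) - t_rows + 1):
        for c in range(len(image[0]) - t_cols + 1):
            # Mean of the squared pixel differences over the current window.
            err = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(t_rows)
                for j in range(t_cols)
            ) / (t_rows * t_cols)
            if best is None or err < best[2]:
                best = (r, c, err)
    return best


# Hypothetical 4x4 image containing the 2x2 template at row 1, column 2.
IMAGE = [
    [0, 0, 0, 0],
    [0, 0, 9, 1],
    [0, 0, 1, 9],
    [0, 0, 0, 0],
]
TEMPLATE = [[9, 1], [1, 9]]
```

Here `mse_correlate(IMAGE, TEMPLATE)` locates the template at row 1, column 2 with zero error. A lower error indicates a closer match, in contrast to a conventional correlation peak, where a higher value indicates a closer match.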


U.S. Patent Nos. 5,159,474 and 5,063,602, expressly incorporated herein by reference, 
also relate to optical image correlators. Also of interest is Li, H.Y., Y. Qiao and D. Psaltis, 
Applied Optics (April, 1993). See also, Bains, S., "Trained Neural Network Recognizes Faces", 
Laser Focus World, June, 1993, pp. 26-28; Bagley, H. & Sloan, J., "Optical Processing: Ready 
For Machine Vision?", Photonics Spectra, August 1993, pp. 101-106. 

Optical pattern recognition has been especially applied to two-dimensional patterns. In 
an optical pattern recognition system, an image is correlated with a set of known image patterns 
represented on a hologram, and the product is a pattern according to a correlation between the 
input pattern and the provided known patterns. Because this is an optical technique, it is 
performed nearly instantaneously, and the output information can be reentered into an electronic 
digital computer through optical transducers known in the art. Such a system is described in 
Casasent, D., Photonics Spectra, November 1991, pp. 134-140. The references cited therein 
provide further details of the theory and practice of such a system: Lendaris, G.G., and Stanely, 
G.L., "Diffraction Pattern Sampling for Automatic Target Recognition", Proc. IEEE 58:198-205 
(1979); Ballard, D.H., and Brown, C.M., Computer Vision, Prentice Hall, Englewood Cliffs, N.J 

DEMOGRAPHICALLY TARGETED ADVERTISING THROUGH ELECTRONIC MEDIA 

Since the advent of commercially subsidized print media, attempts have been made to 
optimize the placement and compensation aspects relating to commercial messages or 
advertisements in media. In general, advertisers subsidize a large percentage of the cost of mass 
publications and communications, in return for the inclusion and possibly strategic placement of 
advertisements in the publication. Therefore, the cost of advertising in such media includes the 
cost of preparation of the advertisement, a share of the cost of publication and a profit for the 
content provider and other services. Since the advertiser must bear some of the cost of 
production and distribution of the content, in addition to the cost of advertisement placement 
itself, the cost may be substantial. The advertiser justifies this cost because of the wide public 
reception of the advertisement, the typically low cost per consumer "impression", and the related 
stimulation of sales due to commercial awareness of the advertiser's products and services. 
Therefore, the advertisement is deemed particularly effective if either the audience is very large, 
with ad response proportionate to the size of the audience, or if it targets a particularly receptive 
audience, with a response rate higher than the general population. 

On the other hand, the recipient of the commercial publication is generally receptive of 
the advertisement, even though it incurs a potential inefficiency in terms of increased data 
content and inefficiencies in receiving the content segment, for two reasons. First, the 
even respond negatively to certain messages. Thus, mass media generally includes a majority of 
retail advertisements, with specialty advertisements relegated, if at all, to a classified section 
which is not interspersed with other content. 

This is the basis for a "least common denominator" theory of marketing, that mass media 
must merchandise to the masses, while specialty media merchandises to selected subpopulations. 
As a corollary, using such types of media, it may be difficult to reach certain specialized 
populations who do not consistently receive a common set of publications or who receive 
primarily publications which are unspecialized or directed to a different specialty. 

Where a recipient has limited time for reviewing media, he or she must divide his or her 
available time between mass media and specialty media. Alternatively, publication on demand 
services have arisen which select content based on a user's expressed interests. Presumably, 
these same content selection algorithms may be applied to commercial messages. However, 
these services are primarily limited distribution, and have content that is as variable as 
commercial messages. Likewise, mass media often has regionally variable content, such as local 
commercials on television or cable systems, or differing editions of print media for different 
regions. Methods are known for demographic targeting of commercial information to 
consumers; however, both the delivery methods and demographic targeting methods tend to be 
suboptimal. 

Sometimes, however, the system breaks down, resulting in inefficiencies. These result 
where the audience or a substantial proportion thereof is inappropriate for the material presented, 
and thus realize a low response rate for an advertiser or even a negative response for the media 
due to the existence of particular commercial advertisers. The recipients are bombarded with 
inappropriate information, while the advertiser fails to realize optimal return on its advertising 
expenditures. In order to minimize the occurrence of these situations, services are available, 
including A.C. Nielsen Co. and Arbitron, Inc., which seek to determine the demographics of the 
audience of broadcast media. 

U.S. 5,436,653, incorporated herein by reference, relates to a broadcast segment 
recognition system in which a signature representing a monitored broadcast segment is compared 
with broadcast segment signatures in a data base representing known broadcast segments to 
determine whether a match exists. Therefore, the broadcast viewing habits of a user may be 
efficiently and automatically monitored, without pre-encoding broadcasts or the like. 
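The comparison of a monitored signature against a database of known signatures can be sketched as follows. This is an illustrative assumption only: the signature format of U.S. 5,436,653 is not described here, so the sketch treats each signature as an integer bit pattern and declares a match when the Hamming distance falls within a hypothetical tolerance. All names and values are invented for the example.

```python
def hamming(a, b):
    """Number of bit positions in which two integer signatures differ."""
    return bin(a ^ b).count("1")


def match_segment(signature, database, max_distance=2):
    """Return the name of the closest known segment, or None when no stored
    signature lies within max_distance bits of the monitored signature."""
    best_name, best_distance = None, max_distance + 1
    for name, known in database.items():
        distance = hamming(signature, known)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name


# Hypothetical database of known broadcast segment signatures.
KNOWN_SEGMENTS = {"ad_A": 0b10110010, "show_B": 0b01001101}
```

A monitored signature of `0b10110011` differs from `ad_A` in one bit and is therefore reported as `ad_A`, while `0b00000000` differs from both stored signatures in four bits and yields no match. Tolerating a small distance allows for noise in the monitored broadcast without requiring any code to be embedded in the broadcast itself.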

U.S. 5,459,306, incorporated herein by reference, relates to a method for delivering 
targeting information to a prospective individual user. Personal user information is gathered, as 
well as information on a user's use of a product, correlated and stored. Classes of information 
potentially relevant to future purchases are then identified, and promotions and recommendations 
delivered based on the information and the user information. 

U.S. 5,483,278, incorporated herein by reference, relates to a system having a user 
interface which can access downloaded electronic programs and associated information records, 
and which can automatically correlate the program information with the preferences of the user, 
to create and display a personalized information database based upon the results of the 
correlation. Likewise, U.S. 5,223,914, expressly incorporated herein by reference, relates to a 
system and method for automatically correlating user preferences with a T.V. program 
information database. 

U.S. Patent No. 5,231,494, expressly incorporated herein by reference, relates to a system 
that selectively extracts one of a plurality of compressed television signals from a single channel 
based on viewer characteristics. 

U.S. Patent No. 5,410,344 relates to a system for selecting video programs according to 
viewers' preferences, based on content codes of the programs. 
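Selection of programs by correlating content codes with viewer preferences, as in the systems described above, may be sketched as a simple weighted scoring. This is a hypothetical illustration and not the method of any cited patent; the preference weights, content codes and program titles are invented for the example.

```python
def rank_programs(preferences, programs):
    """Order program titles by the summed preference weight of their content
    codes, highest first; programs scoring zero or less are omitted."""
    scored = []
    for title, codes in programs.items():
        score = sum(preferences.get(code, 0.0) for code in codes)
        if score > 0:
            scored.append((score, title))
    # Sort by descending score, then by title to keep the order deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored]


# Hypothetical viewer preferences (content code -> weight) and program listings.
PREFERENCES = {"news": 1.0, "sports": 0.5, "cooking": -1.0}
PROGRAMS = {
    "Evening Report": {"news"},
    "Match Day": {"sports", "live"},
    "Kitchen Hour": {"cooking"},
}
```

With these values, `rank_programs(PREFERENCES, PROGRAMS)` returns `["Evening Report", "Match Day"]`; the cooking program is excluded because its only content code carries a negative weight.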

U.S. 5,485,518, incorporated herein by reference, relates to a system for electronic media 
program recognition and choice, allowing, for example, parental control of the individual 
programs presented, without requiring a transmitted editorial code. 



VIDEOCONFERENCING TECHNOLOGIES 

Videoconferencing systems are well known in the art. A number of international 
standards have been defined, providing various telecommunication bandwidth and 
communication link options. For example, H.320, H.323 and H.324 are known transport 
protocols over ISDN, packet switched networks and public switched telephone networks, 
respectively. H.324 provides a multimedia information communication and videoconferencing 
standard for communication over the standard "plain old telephone system" network ("POTS"), 
in which the video signal is compressed using DCT transforms and motion compensation for 
transmission over a V.80 synchronous V.34-type modem link. The video image is provided as a 
video window with relatively slow frame rate. This image, in turn, may be presented on a 
computer monitor or television system, with appropriate signal conversion. See, Andrew W. 
Davis, "Hi Grandma!: Is It Time for TV Set POTS Videoconferencing?", Advanced Imaging, pp. 
45-49 (March 1997); Jeff Child, "H.324 Paves Road For Mainstream Video Telephony", 
Computer Design, January 1997, pp. 107-110. A newly proposed set of extensions to H.324, 
called H.324/M, provides compatibility with mobile or impaired telecommunications systems, 
and accommodates errors and distortions in transmissions, reduced or variable transmission rates 
and other anomalies of known available mobile telecommunications systems, such as Cellular, 
GSM, and PCS. 

Four common standards are employed, which are necessary for videoconferencing 
stations to communicate with each other under common standards. The first is called H.320, and 
encompasses relatively high bandwidth systems, in increments of 64 kbits/sec digital 



Hoffberg et al. 



-53- 



LIH-13 



communication with a synchronous communication protocol. Generally, these systems 
communicate at 128 kbits/sec, 256 kbits/sec or 384 kbits/sec, over a number of "bonded" 
ISDN B-channels. The second standard, H.324, employs a standard POTS communication link 
with a V.80/V.34bis modem, communicating at 33.6 kbits/sec synchronous. The third standard is 
the newly established H.323 standard, which provides for videoconferencing over a packet 
switched network, such as Ethernet, using IPX or TCP/IP. Finally, there are so-called Internet 
videophone systems, such as Intel ProShare. See, Andrew W. Davis, "The Video Answering 
Machine: Intel ProShare's Next Step", Advanced Imaging, pp. 28-30 (March 1997). 
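The rates quoted for the first standard follow from bonding whole B-channels in increments of 64 kbits/sec. The arithmetic can be sketched as follows; the function and constant names are hypothetical.

```python
B_CHANNEL_KBITS = 64  # rate of one ISDN B-channel, per the increments stated above


def bonded_rate(channels):
    """Aggregate rate in kbits/sec of a number of bonded B-channels."""
    if channels < 1:
        raise ValueError("at least one B-channel is required")
    return channels * B_CHANNEL_KBITS
```

Bonding two, four or six B-channels gives 128, 256 and 384 kbits/sec respectively, matching the rates listed.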

In known standards-based videoconferencing systems, the image is generally compressed 
using a discrete cosine transform, which operates in the spatial frequency domain. In this 
domain, visually unimportant information, such as low frequencies and high frequency noise, is 
eliminated, leaving visually important information. Further, because much of the information in 
a videoconference image is repeated in sequential frames, with possible movement, this 
redundant information is transmitted infrequently and filtered from the transmitted image stream, 
and described with motion vector information. This motion vector information encodes objects 
which are fixed or move somewhat between frames. Such known techniques include H.261, 
with integer pixel motion estimation, and H.263, which provides 1/2 pixel motion estimation. 
Other techniques for video compression are known or have been proposed, such as H.263+, and 
MPEG-4 encoding. Many standard videoconferencing protocols require the initial transmission 
of a full frame image, in order to set both transmitting and receiving stations to the same 
encoding state. The digital data describing this image is typically Huffman encoded for 
transmission. Multiple frames may be combined and coded as a unit, for example as so-called 
PB frames. Other techniques are also known for reducing image data transmission bandwidth for 
various applications, including video conferencing. 
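The transform step described above may be illustrated with a one-dimensional discrete cosine transform. This is a simplified sketch only: the standards named operate on two-dimensional pixel blocks and add quantization, motion compensation and entropy coding, none of which is shown. The sketch assumes an orthonormal DCT-II, and the function names and the choice of which coefficients to discard are hypothetical.

```python
import math


def _scale(k, n):
    """Orthonormal scaling factor for coefficient k of an n-point DCT-II."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(block):
    """Transform n samples into n spatial-frequency coefficients."""
    n = len(block)
    return [
        _scale(k, n)
        * sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(block))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse transform: rebuild the samples from the coefficients."""
    n = len(coeffs)
    return [
        sum(
            _scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs)
        )
        for i in range(n)
    ]


def compress(block, keep):
    """Retain the first `keep` coefficients and set the remainder to zero."""
    coeffs = dct(block)
    return coeffs[:keep] + [0.0] * (len(coeffs) - keep)
```

Applying `idct(dct(block))` recovers the original samples, while `idct(compress(block, keep))` gives an approximation whose error shrinks as more coefficients are kept. A block of constant value is reproduced exactly from its first coefficient alone, which is why smooth image regions compress well.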

Each remote videoconference terminal has an interface system, which receives the digital 
data, and separates the video information (H.261, H.263), audio information (G.711, G.723, 
G.723.1), data protocol information (HDLC, V.14, LAPM, etc.) and control information (H.245, 
H.221/H.223) into discrete streams, which are processed separately. Likewise, each terminal 
interface system also assembles the audio information, video information, data protocols and 
control data for transmission. The control information consists of various types of information; 
the standard control protocol which addresses the data format, error correction, exception 
handling, and other types of control; and the multipoint control information, such as which 
remote videoconference terminal(s) to receive audio information from, selective audio muting, 
and such. Generally, the standard, low level control information is processed locally, at the 
codec interface system, and filtered from the remainder of the multipoint control system, with 
only the extracted content information made available to the other stations. 
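The separation of received data into discrete streams may be sketched as follows. The representation of the received data as labelled packets is an assumption made for illustration; actual terminals demultiplex according to the framing of the applicable standard, which is not modelled here, and all names are hypothetical.

```python
from collections import defaultdict

STREAM_KINDS = ("video", "audio", "data", "control")


def demultiplex(packets):
    """Split a sequence of (kind, payload) packets into one list per stream,
    preserving arrival order within each stream."""
    streams = defaultdict(list)
    for kind, payload in packets:
        if kind not in STREAM_KINDS:
            raise ValueError(f"unknown stream kind: {kind!r}")
        streams[kind].append(payload)
    # Return every stream kind, including those that received no packets.
    return {kind: streams[kind] for kind in STREAM_KINDS}
```

For example, demultiplexing the packets `[("video", b"v1"), ("audio", b"a1"), ("control", b"c1"), ("video", b"v2")]` yields a video stream of two payloads, audio and control streams of one payload each, and an empty data stream. Each stream can then be passed to its own decoder, and the control stream can be handled locally as described above.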

The ITU has developed a set of multipoint videoconferencing standards or 
recommendations, T.120-T.133, T.RES series, H.231, H.243, etc. These define control schemes 
for multiple party video conferences. Typically, these protocols are implemented in systems that 
either identically replicate the source image data stream from one source to a plurality of 
destinations, or completely decode and reencode the image in a different format in a "transcoder" 
arrangement, to accommodate incompatible conference stations. The ITU standards also allow 
optional data fields which may be used to communicate digital information essentially outside 
the videoconference scheme, and provide data conferencing capabilities, which allow 
videoconferencing and data conferencing to proceed simultaneously. See, ITU T.120-T.127, 
T.130-T.133, T.RES, T.Share and T.TUD recommendations, expressly incorporated herein by 
reference. 

There are a number of known techniques for transmitting and displaying alphanumeric 
data on a television, the most common of which are teletext, used primarily in Europe, and 
closed caption, which is mandated in television sets larger than 13 inches by the Television 
Decoder Circuitry Act of 1990, and Section 305 of the Telecommunications Act of 1996, and 
Federal Communications Commission (FCC) regulations. The American closed caption standard 
is EIA 608. The latter is of particular interest because many current generation televisions, 
especially larger sizes, include a closed caption decoder, and thus require no external hardware or 
connections, separate from the hardware and cabling for supplying the video signal. See, TCC 
Tech Facts, Vols. 1-4, (www.wgbh.org, rev. 9/95) expressly incorporated herein by reference. 
The closed caption signal is distributed on Line 21 of the vertical blanking interval (VBI). The 
existing standard supports 480 bits/sec, with a potential increase to 9600 bits/sec in the 
forthcoming ATSC standard. 

Electronic Program Guide (EPG) information and advertising information are presently 
being transmitted during the VBI in the U.S. by NBC affiliates, using the Gemstar system. 
Proposals exist for distributing such information using a 900 MHz paging network to wireless 
receivers associated with television viewing apparatus, and further to provide bi-directional 
capabilities and electronic commerce integration. 

Known systems provide a videoconferencing system which resides in a "set top box", i.e., 
a stand-alone hardware device suitable for placement on top of a television set, providing all of the 
necessary functionality of a videoconferencing system employing the television as the display 
and possibly audio speaker functions. These systems, however, do not integrate the television 
functions, nor provide interaction between the video and videoconferencing systems. C-Phone 
Inc., Wilmington NC, provides a C-Phone Home product line which provides extensions to 
H.324 and/or H.320 communications in a set-top box. 

Other known videophone and videoconferencing devices are disclosed, e.g., in U.S. 
Patent Nos. 5,600,646; 5,565,910; 5,564,001; 5,555,443; 5,553,609; 5,548,322; 5,542,102; 
5,537,472; 5,526,405; 5,509,009; 5,500,671; 5,490,208; 5,438,357; 5,404,579; 5,374,952; 
5,224,151; 4,543,665; 4,491,694; 4,465,902; 4,456,925; 4,427,847; 4,414,432; 4,377,729; 
4,356,509; 4,349,701; 4,338,492; 4,008,376 and 3,984,638, each of which is expressly 
incorporated herein by reference. 

Known Web/TV devices (from Sony/Magnavox/Philips) allow use of a television to 
display alphanumeric data, as well as audiovisual data, but format this data for display outside 
the television. In addition, embedded Web servers are also known. See, Richard A. Quinell, 
"Web Servers in embedded systems enhance user interaction", EDN, April 10, 1997, pp. 61-68, 
incorporated herein by reference. Likewise, combined analog and digital data transmission 
schemes are also known. See U.S. Patent No. 5,404,579. 

A class of computing devices, representing a convergence of personal computers and 
entertainment devices, and which provide network access to the Internet (a publicly available 
network operating over TCP/IP), is also known. ITU standards for communications systems allow 
the selective addition of data, according to T.120-T.133, T.RES series of protocols, as well as 
HDLC, V.14, LAPM, to the videoconference stream, especially where excess bandwidth is 
available for upload or download. 

A system may be provided with features enabling it to control a so-called smart house 
and/or to be a part of a security and/or monitoring system, with imaging capability. These 
functions are provided as follows. As discussed above, various data streams may be integrated 
with a videoconference data stream over the same physical link. Therefore, external inputs and 
outputs may be provided to the videophone or videoconference terminal, which may be processed 
locally and/or transmitted over the telecommunications link. The local device, in this case, is 
provided with a continuous connection or an autodial function, to create a communications link 
as necessary. Therefore, heating ventilation and air conditioning control (HVAC), lighting, 
appliances, machinery, valves, security sensors, locks, gates, access points, etc., may all be 
controlled locally or remotely through interfaces of the local system, which may include logic 
level signals, relays, serial ports, computer networks, fiber optic interfaces, infrared beams, radio 
frequency signals, transmissions through power lines, standard-type computer network 
communications (twisted pair, coaxial cable, fiber optic cable), acoustic transmissions and other 
known techniques. Likewise, inputs from various devices and sensors, such as light or optical, 
temperature, humidity, moisture, pressure, fluid level, security devices, radio frequency, acoustic, 
may be received and processed locally or remotely. A video and audio signal transmission may 
also be combined with the data signals, allowing enhanced remote monitoring and control 
possibilities. This information, when transmitted through the telecommunication link, may be 
directed to another remote terminal, for example a monitoring service or person seeking to 
monitor his own home, or intercepted and processed at a central control unit or another device. 
Remote events may be monitored, for example, on a closed caption display mode of a television 
attached to a videophone. 
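The local processing of sensor inputs and the transmission of events over the telecommunications link may be sketched as follows. This is a hypothetical illustration only; the class, its methods, the threshold rule and the message format are assumptions made for the example and do not describe any particular embodiment or interface standard.

```python
class HomeTerminal:
    """Hypothetical videophone terminal that also handles home-control signals."""

    def __init__(self, link_connected=False):
        self.link_connected = link_connected
        self.outputs = {}      # device name -> last commanded value
        self.transmitted = []  # messages sent over the telecommunications link

    def set_output(self, device, value):
        """Drive a local output such as a relay, lock or HVAC setpoint."""
        self.outputs[device] = value

    def sensor_input(self, sensor, reading, limit):
        """Process a reading locally; report it remotely when it exceeds limit."""
        exceeded = reading > limit
        if exceeded:
            if not self.link_connected:
                # Stands in for the autodial function creating a link as necessary.
                self.link_connected = True
            self.transmitted.append(f"{sensor}: {reading}")
        return exceeded
```

A reading within its limit is handled locally and nothing is sent. A reading above its limit brings up the link if it is not already connected and queues a short text message, which a remote terminal could present, for example, in a closed caption display mode.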

While the preferred embodiments of the invention adhere to established standards, the 
present invention also encompasses communications which deviate from or extend beyond such 
standards, and thus may engage in proprietary communications protocols, between compatible 
units. 

OTHER REFERENCES 

In addition, the following patents are considered relevant to the data compression and 
pattern recognition functions of the apparatus and interface of the present invention and are 
incorporated herein by reference: U.S. Patent Nos. 3,609,684; 3,849,760; 3,950,733; 3,967,241; 
H 331; and Re. 33,316. The 
aforementioned patents, some of which are mentioned elsewhere in this disclosure, and which 
form a part of this disclosure, may be applied in known manner by those skilled in the art in 
order to practice various embodiments of the present invention. 

The following scientific articles, some of which are discussed elsewhere herein, are 
understood by those skilled in the art and relate to the pattern recognition and image compression 
functions of the apparatus and interface of the present invention: 

"Fractal Geometry-Understanding Chaos", Georgia Tech Alumni Magazine, p. 16 (Spring 

30 1986). 
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"Fractal Modelling of Biological Structures", School of Mathematics, Georgia Institute of 
Technology (date unknown).

"Fractal Modelling of Real World Images", Lecture Notes for Fractals: Introduction,
Basics and Perspectives, Siggraph (1987).

"Fractals Yield High Compression", Electronic Engineering Times, September 30, 1991,
p. 39.

"Fractals-A Geometry of Nature", Georgia Institute of Technology Research Horizons, p.
9 (Spring 1986).

"Voice Recognition and Speech Processing", Elektor Electronics, Sep. 1985, pp. 56-57.

Aleksander, I., "Guide to Pattern Recognition Using Random-Access Memories",
Computers and Digital Techniques, 2(1):29-40 (Feb. 1979).

Anderson, F., W. Christiansen, B. Kortegaard, "Real Time, Video Image Centroid 
Tracker", Apr. 16-20, 1990. 

Anson, L., M. Barnsley, "Graphics Compression Technology", SunWorld, pp. 43-52
(October 1991).

Appriou, A., "Interet des theories de l'incertain en fusion de donnees", Colloque
International sur le Radar Paris, 24-28 avril 1989.

Appriou, A., "Procedure d'aide a la decision multi-informateurs. Applications a la
classification multi-capteurs de cibles", Symposium de l'Avionics Panel (AGARD) Turquie,
25-29 avril 1988.

Arrow, K. J., "Social Choice and Individual Values", John Wiley and Sons Inc. (1963).

Barnsley et al., "A Better Way to Compress Images", Byte Magazine, Jan. 1988.

Barnsley et al., "Harnessing Chaos for Image Synthesis", Computer Graphics, 22(4)
(8/1988).

Barnsley et al., "Hidden Variable Fractal Interpolation Functions", School of
Mathematics, Georgia Institute of Technology, Atlanta, GA. 30332, Jul., 1986.

Batchelor, B. G., "Pattern Recognition, Ideas in Practice", Plenum Press, London and
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SUMMARY AND OBJECTS OF THE INVENTION 
The present invention provides, according to one embodiment, an adaptive user interface 
which changes in response to the context, past history and status of the system. The strategy 
employed preferably seeks to minimize, for an individual user at any given time, the search and 
acquisition time for the entry of data through the interface. 

The interface may therefore provide a model of the user, which is employed in a 
predictive algorithm. The model parameters may be static (once created) or dynamic, and may 
be adaptive to the user or alterations in the use pattern. 

The present invention also provides a model-based pattern recognition system, for 
determining the presence of an object within an image. By providing models of the objects 
within an image, the recognition process is relatively unaffected by perspective, and the 
recognition may take place in a higher dimensionality space than the transmitted media. Thus, 
for example, a motion image may include four degrees of freedom: x, y, chroma/luma, and time.
A model of an object may include further dimensions, including z, and axes of movement. 
Therefore, the model allows recognition of the object in its various configurations and 
perspectives. 

According to a particular embodiment of the invention, an image or scene, expressed as 
an ordered set of coefficients of an algorithm, wherein the coefficients relate to elements of 
defined variation in scale, and the resulting set of coefficients is related to the underlying image 
morphology, is exploited in order to provide a means for pattern analysis and recognition without 
requiring transformation to an orthogonal coordinate space (e.g., pixels). Typically, the 
expression of the image is compressed with loss of information. 

A major theme of the present invention is the use of intelligent, adaptive pattern 
recognition in order to provide the operator with a small number of high probability choices, 
which may be complex, without the need for explicit definition of each atomic instruction 
comprising the desired action. The interface system predicts a desired action based on the user 
input, a past history of use, a context of use, and a set of predetermined or adaptive rules. 

Because the present invention emphasizes adaptive pattern recognition of both the input 
of the user and data that may be available, the interface system proposes the extensive use of 
advanced signal processing and neural networks. These processing systems may be shared 
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between the interface system and the functional system, and therefore a controller for a complex 
system may make use of the intrinsic processing power available rather than requiring additional 
computing power, although this unification is not required. In the case where the user interface 
employs common hardware elements, it is further preferred that the interface subsystem employ 
common models of the underlying data structures on which the device functionally operates. 

In fact, while hardware efficiency dictates common hardware for the interface system and 
the operational routine, other designs may separate the interface system from the operational 
system, allowing portability and efficient application of a single interface system for a number of 
operational systems. Thus, the present invention also proposes a portable human interface 
system which may be used to control a number of different devices. In this case, a web browser 
metaphor is preferred, as it has become a standard for electronic communications. 

A portable interface may, for example, take the form of a personal digital assistant or 
downloaded JAVA applet, with the data originating in a web server. The data from a web server 
or embedded web server may include a binary file, a generic HTML/XML file, or other data 
type. The interface receives the data and formats it based, at least in part, on parameters specific 
to the client or user. Thus, the presentation of data is responsive to the user, based on user 
preferences, as opposed to hardware limitations or compatibility issues. In a preferred 
embodiment, the data is transmitted separately from the presentation definition. The presentation 
definition, on the other hand, provides a set of parameters that propose or constrain the data 
presentation. The user system also provides a set of parameters that set preferences on 
presentation. Further, the data itself is analyzed for appropriate presentation parameters. These 
three sets of considerations are all inputs into a "negotiation" for an ultimate presentation 
scheme. Thus, the presentation is adaptive to server parameters, user parameters, and the data 
itself. For example, in a typical web-context, the color, size, typestyle, and layout of text may be 
modified based on these considerations. Other factors that may be altered include frame size and 
layout, size of hotspots, requirement for single or double clicks for action, and the like. 
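
By way of illustration only, the three-way "negotiation" described above might be sketched as follows. The `Constraint` type, the `negotiate` function, the parameter names, and in particular the order of precedence (server proposal, then data-derived hint, then user preference, all clamped to server bounds) are assumptions introduced for this example; the disclosure does not prescribe a particular resolution rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Constraint:
    """A server-side presentation parameter: a proposal plus optional hard bounds."""
    proposed: float
    minimum: Optional[float] = None
    maximum: Optional[float] = None


def negotiate(server, user, data_hints):
    """Resolve one presentation scheme from three sets of parameters.

    Assumed precedence (hypothetical): start from the server proposal, let a
    hint derived from the data override it, let the user preference override
    both, then clamp the result to the server's hard bounds.
    """
    result = {}
    for key, constraint in server.items():
        value = constraint.proposed
        if key in data_hints:
            value = data_hints[key]
        if key in user:
            value = user[key]
        if constraint.minimum is not None:
            value = max(value, constraint.minimum)
        if constraint.maximum is not None:
            value = min(value, constraint.maximum)
        result[key] = value
    return result


server = {
    "font_pt": Constraint(proposed=12, minimum=8, maximum=24),
    "hotspot_px": Constraint(proposed=16, minimum=12),
}
user = {"font_pt": 30}           # user prefers very large text
data_hints = {"hotspot_px": 20}  # analysis of the data suggests larger hotspots

scheme = negotiate(server, user, data_hints)
```

In this sketch the user's request for 30-point text is honored only up to the server's 24-point ceiling, while the hotspot size follows the data-derived hint because the user expressed no preference for it.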

The adaptive nature of the present invention derives from an understanding that people 
learn most efficiently through the interactive experiences of doing, thinking, and knowing. For 
ease-of-use, efficiency, and lack of frustration of the user, the interface of the device should be 
intuitive and self explanatory, providing perceptual feedback to assist the operator in 
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communicating with the interface, which in turn allows the operational system to receive a 
description of a desired operation. Another important aspect of man-machine interaction is that 
there is a learning curve, which dictates that devices which are especially easy to master become 
frustratingly elemental after continued use, while devices which have complex functionality with 
many options are difficult to master and may be initially rejected, or the user stops exploring.
One such system which addresses this problem is U.S. Patent No. 5,005,084, expressly
incorporated herein by reference. The present invention addresses these issues by determining
the most likely instructions of the operator, and presenting these as easily available choices, by
analyzing the past history data and by detecting the "sophistication" of the user in performing a
function, based on all information available to it. The context of use may also be a significant
factor. The interface seeks to optimize the relevant portion of the interface adaptively and
immediately in order to balance and optimize the interface for both quantitative and qualitative
factors. This functionality may greatly enhance the quality of interaction between man and
machine, allowing a higher degree of overall system sophistication to be tolerated and a greater
value added than other interface designs. See, Commaford, C., "User-Responsive Software Must
Anticipate Our Needs", PC Week, May 24, 1993.

The present interface system analyzes data from the user, which may be both the
selections made by the user in context, as well as the efficiency by which the user achieves the
selection. Thus, information concerning both the endpoints and time-dependent path of the
process is considered and analyzed by the interface system.
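
One possible way to capture both the endpoint and the path of a selection, and to derive from them a rough estimate of user "sophistication", is sketched below. The `Interaction` record, the `path_efficiency` measure, the smoothing constant `alpha`, and the assumption that a shortest path to each selection is known are all hypothetical and are offered only as an example.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    selection: str       # endpoint: the function finally selected
    steps_taken: int     # path: number of inputs the user actually made
    steps_minimum: int   # assumed known: shortest path to the same selection
    seconds: float       # path: elapsed time (recorded, not scored in this sketch)


def path_efficiency(item):
    """Shortest path divided by actual path, capped at 1.0 (1.0 = optimal)."""
    return min(1.0, item.steps_minimum / max(item.steps_taken, 1))


def sophistication(history, alpha=0.3):
    """Exponentially smoothed path efficiency; alpha sets how fast new evidence counts."""
    score = None
    for item in history:
        e = path_efficiency(item)
        score = e if score is None else (1 - alpha) * score + alpha * e
    return score


history = [
    Interaction("record", steps_taken=12, steps_minimum=4, seconds=40.0),
    Interaction("record", steps_taken=8, steps_minimum=4, seconds=22.0),
    Interaction("record", steps_taken=4, steps_minimum=4, seconds=9.0),
]
score = sophistication(history)
```

A rising score of this kind could serve as one input, among others, when deciding how much of the interface to expose to a given user.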

The interface of the present invention may be advantageously applied to an operational 
system that has a plurality of functions, certain of which are unnecessary or are rarely used in 
various contexts, while others are used with greater frequency. In such systems, the functionality 
use is usually predictable. Therefore, the present invention provides an optimized interface 
system which, upon recognizing a context, dynamically reconfigures the availability or ease of
availability of functions and allows various subsets to be used through "shortcuts". The interface
presentation will therefore vary over time, use and the particular user. 
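
A minimal sketch of such context-dependent reconfiguration follows, assuming that usage is simply counted per context and that the most frequently used functions are promoted to a fixed number of shortcut positions. The `ShortcutMenu` class, its method names, and the context labels are hypothetical.

```python
from collections import Counter, defaultdict


class ShortcutMenu:
    """Promotes the functions used most often in a context to shortcut slots."""

    def __init__(self, functions, shortcut_slots=3):
        self.functions = list(functions)
        self.shortcut_slots = shortcut_slots
        self.usage = defaultdict(Counter)  # context -> function -> use count

    def record(self, context, function):
        """Note one use of a function in the given context."""
        self.usage[context][function] += 1

    def layout(self, context):
        """Return (shortcuts, remaining functions) for the recognized context."""
        counts = self.usage[context]
        shortcuts = [f for f, _ in counts.most_common(self.shortcut_slots)]
        rest = [f for f in self.functions if f not in shortcuts]
        return shortcuts, rest


menu = ShortcutMenu(["play", "record", "timer", "eject", "tracking"], shortcut_slots=2)
for function, uses in [("record", 3), ("timer", 2), ("play", 1)]:
    for _ in range(uses):
        menu.record("evening", function)

evening_layout = menu.layout("evening")
morning_layout = menu.layout("morning")
```

The same device thus presents different shortcuts in the "evening" context than in a context for which no usage has yet been observed, where the default ordering is retained.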

The advantages to be gained by using an intelligent data analysis interface for facilitating 
user control and operation of the system are more than merely reducing the average number of 
selections or time to access a given function. Rather, advantages also arise from providing a
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means for access and availability of functions not necessarily previously existing or known to the 
user, therefore improving the perceived quality and usefulness of the product. Further 
advantages over prior interfaces accrue due to the availability of pattern recognition functionality 
as a part of the interface system. 

In those cases where the pattern recognition functions are applied to large amounts of 
data or complex data sets, in order to provide a sufficient advantage and acceptable response 
time, powerful computational resources, such as advanced DSPs or neural network processors,
are made available to the interface system. On the other hand, where the data is simple or of 
limited scope, aspects of the technology may be easily implemented as added software 
functionality as improvements of existing products having limited computational resources. 

The application of these technologies to multimedia systems provides a new model for 
performing image pattern recognition on multimedia data and for the programming of 
applications including such data. The ability of the interface of the present invention to perform 
abstractions and make decisions regarding a closeness of presented data to selection criteria
makes the interface suitable for use in a programmable control, i.e., determining the existence of 
certain conditions and taking certain actions on the occurrence of detected events. Such 
advanced technologies might be especially valuable for disabled users. 

In a multimedia environment, a user often wishes to perform an operation on a 
multimedia data event. Past systems have required explicit indexing of images and events. The 
present technologies, however, allow an image, diagrammatic, abstract or linguistic description 
of the desired event to be acquired by the interface system from the user and applied to identify 
or predict the multimedia event(s) desired without requiring a separate manual indexing or 
classification effort. These technologies may also be applied to single media data. 

The interface system according to the present invention is not limited to a single data 
source, and may analyze data from many different sources for its operation. This data may be 
stored data or present in a data stream. Thus, in a multimedia system, there may be a real-time 
data stream, a stored event database, as well as an exemplar or model database. Further, since 
the device is adaptive, information relating to past experience of the interface, both with respect 
to exposure to data streams and user interaction, is also stored. This data analysis aspect of the 
operation of the present interface system may be substantially processor intensive, especially 
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where the data includes abstract or linguistic concepts or images to be analyzed. Interfaces 
which do not relate to the processing of such data may be implemented on simpler hardware. On 
the other hand, systems which handle complex data types may necessarily include sophisticated 
processors, adaptable for use with the interface system, thus minimizing the additional 
computing power necessary in order to implement the interface according to the present
invention. A portion of the data analysis may also overlap the functional analysis of the data for
operation.

A fractal-based image processing system exemplifies one application of the technologies. 
A fractal-based system includes a database of image objects, which may be preprocessed in a 
manner which makes them suitable for comparison to a fractal-transformed image representation
of an image to be analyzed. Thus, corresponding "fractal" transforms are performed on the
unidentified image or a portion thereof and on an exemplar of a database. A degree of
relatedness is determined in this "fractal transform domain", and the results used to identify
objects within the image. The system then makes decisions based on the information content of
the image, i.e. the objects contained therein.
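
The comparison step may be illustrated, under strong simplifying assumptions, as follows. The sketch does not perform a fractal transform; it assumes that the unidentified image and each exemplar have already been reduced to coefficient vectors of equal length by some transform, and it uses cosine similarity as a stand-in for the "degree of relatedness". The function names, the threshold, and the sample coefficients are hypothetical.

```python
import math


def relatedness(a, b):
    """Cosine similarity between two coefficient vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(unknown, exemplars, threshold=0.9):
    """Return (label, score) of the closest exemplar, or (None, score) below threshold."""
    label, score = max(
        ((name, relatedness(unknown, coeffs)) for name, coeffs in exemplars.items()),
        key=lambda pair: pair[1],
    )
    return (label, score) if score >= threshold else (None, score)


# Stand-in coefficient vectors; a real system would derive these from images.
exemplars = {"face": [0.9, 0.1, 0.4], "car": [0.1, 0.8, 0.2]}

label, score = identify([0.85, 0.15, 0.35], exemplars)
unmatched_label, unmatched_score = identify([1.0, 1.0, 1.0], exemplars)
```

Because the comparison operates on the coefficients directly, neither the unknown image nor the exemplars need be converted back to pixels before a decision is made.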

The fractal-based image processing system presents many advantages. First, fractal-processed
images may have dramatically reduced storage size requirements as compared to
traditional methods while substantially retaining information important for image recognition.
The process may be parallelized, and the exemplars may be multidimensional, further facilitating
the process of identifying a two-dimensional projection of an object. The efficient storage of
information allows the use of inexpensive storage media, e.g., CD-ROM, or the use of an on-line
database through a serial data link, while allowing acceptable throughput. See, Zenith Starsight
Telecast brochure, (1994); U.S. Patent No. 5,353,121, expressly incorporated herein by
reference.

As applied to a multimedia database storage and retrieval system, the user programs,
through an adaptive user interface according to the present invention, the processing of data, by
defining criteria and the actions to be taken based on the determination of the criteria. The
criteria, it is noted, need not be of a predefined type, and in fact this is a particular feature of the
present invention. A pattern recognition subsystem is employed to determine the existence of
selected criteria. To facilitate this process, a database of image objects may be stored as two
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counterparts: first, the data is stored in a compressed format optimized for normal use, such as 
human viewing on a video monitor, using, e.g., MPEG-2 or Joint Photographic Experts Group 
(JPEG) compression; second, it is stored in a preprocessed and highly compressed format 
adapted to be used with the pattern recognition system. Because the preprocessed data is highly 
compressed and used directly by the pattern recognition system, great efficiencies in storage and 
data transmission are achieved. The image preprocessing may include Fourier, DCT, wavelet, 
Gabor, fractal, or model-based approaches, or a combination thereof. 

The potential significant hardware requirement for image processing and pattern 
recognition is counterbalanced by the enhanced functionality available by virtue of the 
technologies. When applied to multimedia devices, the interface system allows the operator to 
define complex criteria with respect to image, abstract or linguistic concepts, which would 
otherwise be difficult or impossible to formulate. Thus, the interface system becomes part of a 
computational system that would otherwise be too cumbersome for use. It is noted that, in many 
types of media streams, a number of "clues" are available defining the content, including closed
caption text, electronic program guides, simulcast data, related Internet web sites, audio tracks,
image information, and the like. The latter two data types require difficult processing in order to 
extract a semantic content, while the former types are inherently semantic data. 

A pattern recognition subsystem allows a "description" of an "event" without explicit 
definition of the data representing the "event". Thus, instead of requiring explicit programming, 
an operator may merely define parameters of the desired "event". This type of system is useful, 
for example, where a user seeks a generic type of data representing a variety of events. This 
eliminates the need for preindexing or standardized characterization of the data. The interface 
system therefore facilitates the formulation of a request, and then searches the database for data 
which corresponds to the request. Such preindexing or standardized characterization is 
extremely limiting with image and multimedia data, because "a picture is worth a thousand 
words", and without a priori knowing the ultimate search criteria, all possible criteria must be 
accounted for. Pattern recognition systems do not require initial translation of visual aspects into 
linguistic concepts, thus allowing broader searching capability. Of course, a pattern recognition 
system may be used in conjunction with other searching schemes, to mutual advantage. 
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The pattern recognition functionality of the interface system is not limited to multimedia 
data, and may be applied to data of almost any type, e.g., real-time sensor data, distributed 
control, linguistic data, etc. 

It is noted that, in consumer electronics and particularly entertainment applications, the 
reliability of the system need not be perfect, and errors may be tolerable. On the other hand, in 
industrial control applications, reliability must be much higher, with fail-safe backup systems in 
place, as well as advanced error checking. One way to address this issue is to allow the advanced 
user interface to propose an action to the user, without actually implementing the action. 
However, in this case, the action and its proposed basis are preferably presented to the user in a 
sophisticated manner, to allow the basis for the action to be independently assessed by the user. 
Therefore, in a complex, multistep process, the user interface may be simplified by permitting a 
three step process: the user triggers a proposed response, analyzes the proposal and rationale, 
and confirms the proposal. Therefore, single step processes are inferior candidates for intelligent 
assistance. 

Another notable aspect of the technologies is the contextual analysis. Multimedia
data often includes a data component that closely corresponds to a format of a search criterion.
Thus, while a search may seek a particular image, other portions of the datastream correlate well 
with the aspect of the image being searched, and may be analyzed by proxy, avoiding the need 
for full image analysis. The resulting preselected reduced number of images may then be fully 
analyzed, if necessary. Thus, especially with respect to consumer electronics applications,
where absolute accuracy may not be required, the processing power available for pattern
recognition need not be sufficient for complete real-time signal analysis of all data. The present
invention therefore proposes use of a variety of available data in order to achieve the desired
level of functionality at minimum cost.
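
The analysis "by proxy" described above may be sketched as a two-stage search, in which an inexpensive text comparison preselects candidates and the costly image analysis is applied only to the reduced set. The record layout, the `search` function, the `budget` parameter, and the stand-in analyzer are assumptions made for this example.

```python
def search(items, keywords, full_analysis, budget=2):
    """Two-stage search: cheap caption proxy first, costly image analysis second."""
    wanted = {k.lower() for k in keywords}

    def proxy_score(item):
        # Number of requested keywords found in the caption text.
        return len(wanted & set(item["caption"].lower().split()))

    # Stage 1: rank by proxy score and keep only the most promising candidates.
    candidates = sorted(
        (i for i in items if proxy_score(i) > 0), key=proxy_score, reverse=True
    )[:budget]
    # Stage 2: run the expensive analyzer on the preselected items only.
    return [i["id"] for i in candidates if full_analysis(i)]


items = [
    {"id": 1, "caption": "the president speaks at the summit", "frame": "A"},
    {"id": 2, "caption": "weather report for tuesday", "frame": "B"},
    {"id": 3, "caption": "president visits flood area", "frame": "C"},
]

analyzed = []  # records which items reached the expensive stage


def stand_in_analyzer(item):
    """Placeholder for full image analysis; here it simply accepts frame 'C'."""
    analyzed.append(item["id"])
    return item["frame"] == "C"


hits = search(items, ["president"], stand_in_analyzer)
```

In this example the second item is never submitted to image analysis, which is the source of the processing savings.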

One aspect of the present invention therefore relates to a mechanism for facilitating a user 
interaction with a programmable device. The interface and method of use of the present
invention serve to minimize the learning and searching times, better reflect users' expectations,
provide better matching to human memory limits, be usable by both novices and experienced 
users, reduce intimidation of novice users by the device, reduce errors and simplify the entering 
of programming data. The present invention optimizes the input format scheme for 
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programming an event-driven device, and can also be applied to many types of programmable 
devices. Thus, certain human factors design concepts, heretofore unexploited in the design of 
consumer electronics devices and industrial controls, have been incorporated, and new precepts 
developed. Background and theory of various aspects of the present invention are disclosed in
"AN IMPROVED HUMAN FACTORED INTERFACE FOR PROGRAMMABLE DEVICES:
A CASE STUDY OF THE VCR", Master's Thesis, Tufts University (Master of Sciences in
Engineering Design, November, 1990, publicly available January, 1991), by Linda I. Hoffberg.
This thesis, and cited references, are incorporated herein by reference, and attached hereto as an 
appendix. Also referenced are: Hoffberg, Linda I., "Designing User Interface Guidelines For 
Time-Shift Programming of a Video Cassette Recorder (VCR)", Proc. of the Human Factors Soc. 
35th Ann. Mtg. pp. 501-504 (1991); and Hoffberg, Linda I., "Designing a Programmable 
Interface for a Video Cassette Recorder (VCR) to Meet a User's Needs", Interface 91 pp. 346-351 
(1991). See also, U.S. Patent Application No. 07/812,805, filed December 23, 1991, 
incorporated herein by reference in its entirety, including appendices and incorporated 
references. 

The present invention extends beyond simple predictive schemes which present 
exclusively a most recently executed command or most recently opened files. Thus, the possible 
choices are weighted in a multifactorial method, e.g., history of use, context and system status, 
rather than a single simple criterion alone. Known simple predictive criteria often exclude 
choices not previously selected, rather than weighing these choices in context with those which 
have been previously selected. While the system according to the present invention may include 
initial weightings, logical preferences or default settings, through use, the derived weightings are 
obtained adaptively based on an analysis of the status, history of use and context. It is noted that 
not all of the possible choices need be weighted, but rather merely a subset thereof. 
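
A multifactorial weighting of this kind might, for example, be sketched as follows. The three factor weights, the scoring formula, the choice records, and the function `rank_choices` are assumptions introduced for illustration; in an adaptive system the weights would themselves be derived from use rather than fixed.

```python
def rank_choices(choices, history, context, status, weights=None, top_n=3):
    """Rank choices by a weighted sum of history, context and status factors."""
    weights = weights or {"history": 0.5, "context": 0.3, "status": 0.2}
    total_uses = sum(history.values()) or 1
    scored = []
    for choice in choices:
        h = history.get(choice["name"], 0) / total_uses        # relative past use
        c = 1.0 if context in choice["contexts"] else 0.0      # fits current context
        s = 1.0 if status in choice["valid_states"] else 0.0   # possible in this state
        score = weights["history"] * h + weights["context"] * c + weights["status"] * s
        scored.append((score, choice["name"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_n]]


choices = [
    {"name": "record", "contexts": {"evening"}, "valid_states": {"tape_loaded"}},
    {"name": "play", "contexts": {"evening", "weekend"}, "valid_states": {"tape_loaded"}},
    {"name": "set_clock", "contexts": {"first_use"}, "valid_states": {"tape_loaded", "no_tape"}},
]
history = {"play": 8, "record": 2}  # set_clock has never been selected

ranking = rank_choices(choices, history, context="evening", status="tape_loaded")
```

A choice that has never been selected, such as `set_clock` here, still receives a score from the status factor and so remains a candidate, in contrast to a scheme based on recency alone.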

For a given system, status, history of use and context may be interrelated factors. For 
example, the status of the machine is determined by the prior use, while the status also intersects 
context. The intended meaning of status is information relating to a path independent state of the 
machine at a given point in time. History of use is intended to implicate more than the mere 
minimum instructions or actions necessary to achieve a given state, and therefore includes 
information unnecessary to achieve a given state, i.e., path dependent information. Context is 
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also related to status, but rather is differentiated in that context refers to information relating to 
the environment of use, e.g., the variable inputs or data upon which the apparatus acts or 
responds. Status, on the other hand, is a narrower concept relating more to the internal and 
constant functionality of the apparatus, rather than the particularities of its use during specific 
circumstances. 

U.S. Patent No. 5,187,797 relates to a machine interface system having hierarchical 
menus, with a simple (three button) input scheme. The choice(s) presented relate only to the 
system status, and not the particular history of use employed to obtain the system status nor the 
context of the choice. This system has a predetermined hierarchical menu structure, which is 
invariant with usage. The goal of this interface system is not to provide a learning interface, but 
rather to teach the user about or conform the user to the dictates of the predetermined and 
invariant interface of the device. While many types of programmable devices are known to exist, 
normally, as provided in U.S. Patent No. 5,187,797, instructions are entered and executed in a 
predetermined sequence, with set branch points based on input conditions or the environment. 
See also U.S. Patent Nos. 4,878,179, 5,124,908, and 5,247,433. 

An aspect of the present invention provides a device having a predetermined or a generic 
style interface upon initial presentation to the user, with an adaptive progression in which 
specialized features become more easily available to a user who will likely be able to make use 
of them, while unused features are or remain "buried" within the interface. The interface also
extracts behavioral information from the user and alters the interface elements to optimize the
efficiency of the user.

A videocassette recorder is a ubiquitous example of a programmable device, and 
therefore forms the basis of much of the discussion herein. It should, of course, be realized that 
many of the aspects of the present invention could be applied by one of ordinary skill in the art to 
a variety of controls having human interfaces, and that these other applications are included 
within the scope of the present invention. 

The VCR apparatus typically involves a remote control entry device, and the interface of 
the present invention contains a graphical interface displayed for programming programmable 
devices. This aspect of the present invention seeks more accurate programming through the use 
of program verification to ensure that the input program is both valid and executable. Thus, it 
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has a mechanism to store and check to verify that there are no conflicting programs. An 
apparatus according to the present invention can be connected, for example, to any infrared 
programmable device in order to simplify the programming process. By way of example only, 
an improved VCR interface forms the basis of a disclosed example. It is, of course, realized that 
the present method and apparatus may be applied to any programmable controller, i.e., any 
device which monitors an event or sensor and causes an event when certain conditions or 
parameters are met, and may also be used in other programming environments, which are not 
event driven. While the present interface is preferably learning and adaptive, it may also detect 
events and make decisions based on known or predetermined characteristics. Where a number of 
criteria are evaluated for making a decision, conflicts among the various criteria are resolved 
based on a strength of an evaluated criterion, a weighting of the criteria, an interactivity function
relating the various criteria, a user preference, either explicitly or implicitly determined, and a 
contextual analysis. Thus, a user override or preference input may be provided to assist in 
resolving conflicts. 
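
The program verification mentioned above, i.e., checking that stored programs are valid and do not conflict, may be sketched as an interval-overlap test. The sketch assumes a recorder able to record only one program at a time, so that any overlap in time is a conflict; the record layout and function names are hypothetical.

```python
from datetime import datetime
from itertools import combinations


def is_valid(program):
    """A program is executable only if it ends after it starts."""
    return program["start"] < program["end"]


def find_conflicts(programs):
    """Return pairs of program names whose recording intervals overlap."""
    conflicts = []
    for a, b in combinations(programs, 2):
        if a["start"] < b["end"] and b["start"] < a["end"]:
            conflicts.append((a["name"], b["name"]))
    return conflicts


programs = [
    {"name": "news", "start": datetime(1999, 1, 4, 18, 0), "end": datetime(1999, 1, 4, 19, 0)},
    {"name": "movie", "start": datetime(1999, 1, 4, 18, 30), "end": datetime(1999, 1, 4, 20, 30)},
    {"name": "late show", "start": datetime(1999, 1, 4, 23, 0), "end": datetime(1999, 1, 5, 0, 0)},
]

all_valid = all(is_valid(p) for p in programs)
conflicts = find_conflicts(programs)
```

A detected conflict of this kind could then be presented to the user, whose override or preference input resolves it.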

The present invention may incorporate an intelligent program recognition and 
characterization system, making use of any of the available cues, which allows an intelligent 
determination of the true nature of the broadcast and therefore is able to make a determination of 
whether parameters should be deemed met even with an inexact match to the specified 
parameters. Therefore, in contradistinction with VPV, the present invention provides, for 
example, intelligence. The VPV is much more like the "VCR Plus" device, known to those 
skilled in the art, which requires that a broadcast be associated with a predetermined code, with 
the predetermined code used as a criterion for initiating recording. Some problems with VCR Plus
include identification of the codes which identify channel and time, post scheduling changes, 
incorrect VCR clock setting, and irregular schedules. VCR Plus also is limiting with respect to 
new technologies and cable boxes. 
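
By way of example, a determination that parameters "should be deemed met even with an inexact match" might be sketched as a tolerant comparison of title and start time. The thresholds, the record layout, and the use of a generic string-similarity ratio from the Python standard library are assumptions; the disclosure does not specify a matching method.

```python
from difflib import SequenceMatcher


def matches(request, broadcast, min_title_ratio=0.8, max_shift_min=30):
    """Deem a request met if the title is similar and the start time is close."""
    title_ratio = SequenceMatcher(
        None, request["title"].lower(), broadcast["title"].lower()
    ).ratio()
    shift = abs(broadcast["start_min"] - request["start_min"])
    return title_ratio >= min_title_ratio and shift <= max_shift_min


# Start times are minutes after midnight.
request = {"title": "Evening News", "start_min": 18 * 60}

delayed = {"title": "The Evening News", "start_min": 18 * 60 + 15}
other = {"title": "Morning Movie", "start_min": 18 * 60}
too_late = {"title": "Evening News", "start_min": 18 * 60 + 45}

results = [matches(request, b) for b in (delayed, other, too_late)]
```

Unlike a scheme keyed to a predetermined code, a comparison of this type tolerates a schedule change of a few minutes or a slightly different program label.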

The videotext signal of the prior art includes a digitally encoded text message that may be 
displayed in conjunction with the displayed image, similar to the closed caption system. The 
aforementioned West German system demonstrates one way in which the transmitted signal may 
be received by a device and interpreted to provide useful information other than the transmitted 
program itself. However, the prior art does not disclose how this signal may be used to index 
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and catalog the contents of a tape, nor does it disclose how this signal may be used to classify or 
interpret the character of the broadcast. In other words, in one embodiment of the present 
invention, the videotext or closed caption signal is not only interpreted as a literal label, as in the 
prior art, but is also further processed and analyzed to yield data about the content of the 
broadcast, other than merely an explicit identification of the simultaneously broadcast
information. 

Beyond or outside the visible region of a U.S. National Television System Committee
(NTSC) broadcast video frame are a number of scan lines which are dedicated to presenting
digital information, rather than analog picture information. Various known coding schemes are
available for transmitting and receiving information in this non-viewing portion of the video
transmission, and indeed standards exist defining the content of these information fields. Of
course, various other transmission schemes provide a format for transmitting data. For example,
standard frequency modulation (FM) transmissions may be associated with digital data
transmissions in a subcarrier. Likewise, satellite transmissions may include digital data along
with an audio data stream or within a video frame, which may be in analog format or digitally
encoded.

Cable systems may transmit information either in the broadcast band or in a separate
band. HDTV schemes also generally provide for the transmission of digital data of various sorts.
Thus, known audio and video transmission systems may be used, with little or no modification,
to provide enhanced functionality, according to the present invention. It is therefore possible to
use known and available facilities for transmitting additional information relating to the
broadcast information, in particular, the characteristics of the video broadcast, and doing so could
provide significant advantages, used in conjunction with the interface and intelligent pattern
recognition controller of the present invention. If this information were directly available, there
would be a significantly reduced need for advanced image recognition functions, such advanced
image recognition functions requiring costly hardware devices, while still maintaining the
advantages of the present invention.

It is noted, however, that the implementation of a system in which characterization data 
of the broadcast is transmitted along therewith might require a new set of standards and the 
cooperation of broadcasters, as well as possibly the government regulatory and approval
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agencies. The present invention does not require, in all of its aspects, such standardization, and 
therefore may advantageously implement substantial data processing locally to the receiver. It is 
nevertheless within the scope of the invention to implement such a broadcast system with 
broadcast of characterization data in accordance with the present invention. Such broadcast 
characterization data may include characterizations as well as preprocessed data useful for 
characterizing according to flexible criteria in the local receiving device. 

According to the present invention, if such characterizations are broadcast, they may, as 
stated above, be in band or out of band, e.g., making use of unused available spectrum bandwidth 
within the NTSC channel space, or other broadcast system channel space, or may be "simulcast" 
on a separate channel, such as an FM sideband or separate transmission channel. Use of a 
separate channel would allow a separate organization, other than the network broadcasters, to 
provide the characterization data for distribution to users of devices that make use of the present 
intelligent system for controlling a VCR or other broadcast information processing device. Thus, 
the characterization generating means need not be directly linked to the local user machine in 
order to fall within the scope of the present invention. The present invention also provides a 
mechanism for copyright holders or other proprietary interests to be protected, by limiting access 
to information by encryption or selective encryption, and providing an accounting system for 
determining and tracking license or broadcast fees. 

Research has been performed relating to VCR usability, technology, implementation, 
programming steps, current technology, input devices, and human mental capacity. This 
research has resulted in a new paradigm for the entry of programming data into a sequential 
program execution device, such as a VCR, by casual users. 

Four major problems in the interfaces of VCRs were found to exist. The first is that users 
spend far too much time searching for the information needed to 
complete the programming process. Second, many people do not program the VCR to record at 
a later time (time-shift) frequently, and thus forget the programming steps in the interim, i.e., the 
inter-session decay of the learning curve is significant. Third, the number of buttons on many 
remote control devices has become overwhelming. Fourth, people have become reluctant to 
operate or program VCRs because of their difficult operation. It was found that, by minimizing 
the learning and searching times, the user's programming time and frustration level can be greatly 
reduced. If VCRs are easier to program, users might program them more frequently. This would 
allow more efficiency and flexibility in broadcast scheduling, especially late night for time shift 
viewing. The present invention therefore provides an enhanced VCR programming interface 
having a simplified information structure, an intuitive operational structure, simplified control 
layout and enhanced automated functionality. 

A new class of consumer device has been proposed, which replaces the videotape of a 
traditional videotape recorder with a random-access storage device, such as a magnetic hard disk 
drive. Multimedia data is converted through a codec (if necessary), and stored in digital form. 
Such systems are proposed by Tivo, Inc., Philips Electronics (Personal TV), Replay Networks, 
Inc. and Metabyte, Inc. Some of these systems employ a user preference based 
programming/recording method similar to that of the present invention. 

In these systems, typically a content descriptive data stream formulated by human editors 
accompanies the broadcast or is available for processing and analysis. Based on a relation of the 
user preferences, which may be implied by actual viewing habits or input through simple 
accept/veto user feedback, selected media events may be recorded. However, such systems rely 
on a correspondence between the factors of interest to users and those encoded in the data stream, 
e.g., a "program guide". This is not always the case. However, where the available data 
describing the program maps reasonably well into the user preference space, such a system may 
achieve acceptable levels of performance, or stated otherwise, the program material selected by 
the system will be considered acceptable. 
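
As a rough illustration of matching guide data against a preference profile, the following Python sketch scores each program guide entry by summing the user's weights for the descriptors attached to it. The descriptor names, weights and threshold are hypothetical, and the scoring rule is only one of many that could be used.

```python
def score_program(descriptors, preferences):
    """Sum the user's weight for each descriptor attached to a guide entry."""
    return sum(preferences.get(d, 0.0) for d in descriptors)


def select_for_recording(guide, preferences, threshold=1.0):
    """Return titles whose score meets the (assumed) recording threshold."""
    return [title for title, descriptors in guide.items()
            if score_program(descriptors, preferences) >= threshold]


# Hypothetical profile: positive weights from viewing habits, negative from vetoes.
preferences = {"comedy": 0.8, "news": 0.5, "sports": -1.0}
guide = {
    "Late Show": ["comedy", "talk"],
    "Evening Report": ["news", "comedy"],
    "Big Game": ["sports", "news"],
}
selected = select_for_recording(guide, preferences)
```

A descriptor absent from the profile contributes nothing to the score, which reflects the limitation noted above: factors of interest to the user that are not encoded in the guide cannot influence the selection.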

One particular aspect of these time-shifting consumer media recording devices is how 
they deal with advertising materials that accompany program material. In many instances, the 
user seeks to avoid "commercials", and the device may be programmed to oblige. However, as 
such devices gain wider acceptance, advertisers will be reluctant to subsidize broadcasts. 
Therefore, an advertising system may be integrated into the playback device that seeks to 
optimize the commercial messages presented to a viewer. By optimizing the messages or 
advertisements, the viewer is more receptive to the message, and economic implications ensue. 
For example, a viewer may be compensated, directly or indirectly, for viewing the commercials, 
which may be closely monitored and audited, such as by taking pictures of the audience in front 
of a "set-top box". The acquired data, including viewer preferences, may be transmitted back to 
commercial sponsors, allowing detailed demographic analysis. 

In order to ensure privacy, the preference information and/or images may be analyzed by 
a proxy, with the raw data separated from the commercial users of such data. Thus, for example, 
the particular users of a system may register their biometric characteristics, e.g., face. Thereafter, 
the imager captures facial images and correlates these with its internal database. The image itself 
therefore need not be stored or transmitted. Viewer preferences and habits, on the other hand, 
likely must be transmitted to a central processing system for analysis. 

Because the system is intelligent, copy protection and royalty accounting schemes may 
readily be implemented. Thus, broadcasters and content providers may encode broadcasts in 
such a way as to control the operation of the consumer device. For example, an IEEE-1394-type 
encryption key support (e.g., DTCP or XCA)/copy protection or DIVX scheme may be 
implemented. Further, certain commercial sponsors may be able to avoid deletion of their 
advertisement, while others may allow truncation. The acceptability of this to the consumer may 
depend on subsidies. In other words, a company is willing to pay for advertising. Instead of 
paying for placements directly to the media, a portion is paid to a service provider, based on 
consumer viewing. The media, on the other hand, may seek to adopt a pay-per-view policy, at 
least with respect to the service provider, in lieu of direct advertising revenues. The service 
provider will account to both advertisers and content providers for use. With sufficient viewing 
of commercials, the entire service charge for a system might be covered for a user. On the other 
hand, a viewer might prefer to avoid all commercials, and not get the benefit of a subsidy. The 
service provider performs the economically efficient function of delivering optimized, 
substituted commercials for the almost random commercials which flood the commercial 
broadcast networks, and thus can accrue greater profits, even after paying content providers a 
reasonable fee. An advertiser, by selecting a particular audience, may pay less than it would 
otherwise pay to a broadcaster. The content providers may also charge more for the privilege of 
use of their works. 

As stated above, the content may be copy protected by the use of encryption and/or 
lockout mechanisms. Thus, by providing an alternative to an analog VCR, a full end-to-end 
encrypted signal may be provided, such as that proposed for the IEEE-1394 copy protection 
scheme. Because enhanced recording capabilities are provided to the consumer, the acceptance 
will be high. Because of the encryption, lack of portability and continued royalty accounting, 
content provider acceptance will also likely be high. 

IEEE 1394 provides for Digital Content Protection. See Bill Pearson, "1394 Digital 
Content Protection," Multimedia Systems Design (11/98). Techniques such as encryption and 
authentication/key exchange maintain content quality without degradation while preventing 
unauthorized copying. The IEEE 1394 content protection system provides four elements of 
digital content protection: Copy control information (CCI); Authentication and key exchange 
(AKE); Content encryption; and System renewability. 

In an IEEE 1394 system, there are source devices and sink devices. The source device 
transmits a copy protection system stream of content. A source device is one that can send a 
stream of content and a sink device is one that can receive a stream of content. Multifunction 
devices such as PCs and record/playback devices such as digital VCRs can be both source and 
sink devices. The following is a step-by-step description of the interaction between source and sink 
devices: The source device initiates the transmission of a stream of content marked with the 
appropriate copy protection status (e.g., "copy once," "copy never," or "no more copies") via the 
EMI bits. Upon receiving the content stream, the sink device inspects the EMI bits to determine 
the copy protection status of the content. If the content is marked "copy never," the sink device 
requests that the source device initiate full AKE. If the content is marked "copy once" or "no 
more copies," the sink device will request full AKE if it is supported, or restricted AKE if it isn't. 
If the sink device has already performed the appropriate authentication, it can then proceed. 
When the source device receives the authentication request, it proceeds with the type of 
authentication requested by the sink device, unless full AKE is requested but the source device 
can only support restricted AKE, in which case restricted AKE is performed. Once the devices 
have completed the required AKE procedure, a content-channel encryption key (content key) can 
be exchanged between them. This key is used to encrypt the content at the source device and 
decrypt the content at the sink. 
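
The negotiation just described can be summarized in a short Python sketch. The function names are hypothetical, and the treatment of unprotected content (no AKE requested) is an assumption, since the passage above only addresses the three protected states.

```python
def ake_requested_by_sink(copy_status, sink_supports_full):
    """AKE type the sink requests for a given copy protection status."""
    if copy_status == "copy never":
        return "full"
    if copy_status in ("copy once", "no more copies"):
        return "full" if sink_supports_full else "restricted"
    return None  # assumption: unprotected content needs no AKE in this sketch


def ake_performed(requested, source_supports_full):
    """AKE type the source actually carries out."""
    if requested == "full" and not source_supports_full:
        return "restricted"
    return requested
```

For example, a sink that supports only restricted AKE and receives "copy once" content would request restricted AKE, and the source would perform it as requested.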

The first element in the content protection scheme is the copy control information (CCI). 
CCI is a way for content owners to specify how their content can be used. Some examples are 
"copy never," "copy once," "no more copies," and "copy free." The content protection system is 
capable of securely communicating copy control information between devices. Two different 
CCI mechanisms are supported and are discussed below. In the event that conflicting copy 
protection requirements are specified by the different mechanisms, sink devices should follow 
the most restrictive CCI available. Embedded CCI is carried as part of the content stream. Many 
content formats (including MPEG) have fields allocated for carrying the CCI associated with the 
stream. The integrity of the embedded CCI is ensured since tampering with the content stream 
results in erroneous decryption of the content. 
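
The rule that a sink follows the most restrictive CCI available may be sketched as follows. The ranking of the four states from least to most restrictive is an assumption made for illustration.

```python
# Ordering from least to most restrictive (an assumed ranking for illustration).
RESTRICTIVENESS = ["copy free", "copy once", "no more copies", "copy never"]


def effective_cci(embedded_cci, emi_cci):
    """Return whichever of the two CCI values is more restrictive."""
    return max(embedded_cci, emi_cci, key=RESTRICTIVENESS.index)
```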

The encryption mode indicator (EMI) provides easily accessible yet secure transmission 
of CCI to bit stream recording devices (such as digital VCRs) that know nothing beyond the 
content. The EMI is placed in an easily accessible location. For 1394 buses, this location is the 
most significant two bits of the synch field of the isochronous packet header. Devices can then 
immediately determine the CCI of the content stream without needing to decode the content 
transport format to extract the embedded CCI. This ability is critical for enabling bit stream 
recording devices that do not recognize and cannot decode specific content formats. If the EMI 
bits are tampered with, the encryption and decryption modes will not match, resulting in 
erroneous decryption of the content. 
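
Reading the EMI amounts to a two-bit extraction, as the following Python sketch shows. It assumes the synch field is four bits wide, and the mapping from bit patterns to copy states is an illustrative assumption rather than a statement of the actual code points.

```python
# Assumed code points, for illustration only; consult the specification for actual values.
EMI_MEANING = {0b00: "copy free", 0b01: "no more copies",
               0b10: "copy once", 0b11: "copy never"}


def emi_from_sy(sy_field):
    """Return the two most significant bits of a 4-bit synch (sy) field."""
    return (sy_field >> 2) & 0b11
```

Because only a shift and a mask are needed, a bit stream recorder can determine the copy status without parsing the content format at all.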

The proposed system is based on robust and accepted cryptographic techniques that have 
evolved over the past 20 years to serve critical military, governmental, and commercial 
applications. These techniques have been thoroughly evaluated by hackers and by legitimate 
cryptography experts, and have proven their ability to withstand attack. The robustness and 
cryptographic stability of the system are derived from the proven strength of the underlying 
technologies, rather than merely how well a certain algorithm can be kept secret. 

Before sharing valuable information, a connected device must first verify that another 
connected device is authentic. In an effort to balance the protection requirements of the film and 
recording industries with the real-world requirements of PC and CE users, the proposal includes 
a choice of two authentication levels, full and restricted. Full authentication can be used with all 
content protected by the system. Restricted authentication enables the protection of "copy-once" 
content only. 

The full authentication system employs the public key-based Digital Signature Standard 
(DSS) and Diffie-Hellman key exchange algorithms. DSS is a method for digitally signing and 
verifying the signatures of digital documents to verify the integrity of the data. Both the DSS and 
Diffie-Hellman implementations for the proposed system employ elliptic curve cryptography. 
This technique offers superior performance compared to systems based on calculating discrete 
logarithms in a finite field. 

The next element of content protection is known as authentication and key exchange 
(AKE). Before sharing valuable information, a connected device must first verify that another 
connected device is authentic. To balance the protection requirements of the content industries 
and the real-world requirements of PC and CE users, the specification includes a choice of two 
authentication levels: full and restricted. Full authentication can be used with all content 
protected by the system. Restricted authentication enables the protection of "copy once" content 
only. 

All compliant devices must be assigned a unique public/private key pair that is generated 
by the DTLA. The private key must be stored within the device in such a way as to prevent its 
disclosure. The preferred method of storing the key would be to use a highly integrated device, 
such as a microcontroller with built-in EPROM. Compliant devices must also be given a device 
certificate by the DTLA. This certificate is stored in the compliant device and used during the 
authentication process. In addition, the compliant device will need to store the other constants 
and keys necessary to implement the cryptographic protocols. Full authentication uses the public 
key-based digital signature standard (DSS) and Diffie-Hellman (DH) key-exchange algorithms. 
DSS is a method for digitally signing and verifying the signatures of digital documents to verify 
the integrity of the data. DH key exchange is used during full authentication to establish control- 
channel symmetric cipher keys, which allows two or more parties to generate a shared key. 
Developed more than 20 years ago, the algorithm is considered secure when it is combined with 
digital signatures to prevent a so-called "man-in-the-middle" attack. A man-in-the-middle attack 
is when one person places himself between two others who are communicating. He can imitate 
either of the participants, modify and delete messages, or generate new ones entirely. A shared 
key helps prevent this type of attack because each message contains a digital signature signed 
with the private key of the source. The receiver of the message can easily verify that the message 
came from the intended source. 

The full authentication protocol begins when the sink device initiates the authentication 
protocol by sending a request to the source device. The first step of the full authentication 
procedure is for the devices to exchange device certificates. Next, they exchange random 
challenges. Then each device calculates a DH key-exchange first-phase value. The devices then 
exchange signed messages that contain the following elements: the other device's random 
challenge; the DH key-exchange first-phase value; and the renewability message version 
number of the newest system renewability message (SRM) stored by the device. The devices 
process the messages they receive by first checking the message signature using the other 
device's public key to verify that the message has not been tampered with. The device also 
verifies the integrity of the other device's certificate. If these signatures cannot be verified, the 
device refuses to continue. Each device also examines the certificate revocation list embedded in 
its SRM to verify that the other device's certificate has not been revoked. In addition, by 
comparing the exchanged renewability version numbers, devices can invoke the SRM upgrade 
mechanisms at a later time. If no errors have occurred during the authentication process, the two 
devices have successfully authenticated each other and established an authorization key. 
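
The role of the DH first-phase values can be illustrated with a minimal Python sketch. Note the substitutions: it uses classic finite-field Diffie-Hellman with a small prime rather than the elliptic curve variant named above, and it omits the certificates, random challenges and signatures entirely. The constants P and G and the function names are assumptions chosen for brevity.

```python
import secrets

P = 2**127 - 1   # a Mersenne prime; far too small for real use
G = 3


def first_phase_value(private):
    """Value each device sends to the other: G**private mod P."""
    return pow(G, private, P)


def shared_secret(their_first_phase, my_private):
    """Both devices arrive at G**(a*b) mod P without sending a or b."""
    return pow(their_first_phase, my_private, P)


source_private = secrets.randbelow(P - 3) + 2
sink_private = secrets.randbelow(P - 3) + 2
source_value = first_phase_value(source_private)
sink_value = first_phase_value(sink_private)
```

Each side combines the other's first-phase value with its own private value, so both compute the same secret. Without the signed messages described above, however, nothing in this exchange would stop a man-in-the-middle from substituting his own first-phase values.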

System renewability messages are not particularly used to disable source devices, but 
rather sink devices. For example, suppose a person manages to get hold of a device ID for a digital 
television, and then modifies a digital VCR to have the device ID of his digital television. Then, 
when any device is talking with the modified digital VCR, it will think it is talking to a digital 
TV and will send data to the device, allowing a person to copy protected content. This is detected 
when a pirate device is discovered. Once one of these pirate devices is detected, they can all 
easily be disabled because they all share the same device ID. Once the device ID has been 
disabled, the SRM will propagate itself to other devices. Then, no legitimate device will allow 
protected content to be sent to the pirate device. The memory required for this function is limited 
to ensure that it is reasonable to implement in low-cost consumer devices. 

Restricted authentication is used between source devices and sink devices for the 
exchange of "copy once" and "no more copies" contents. Devices that only support "copy once" 
and "no more copies" content such as digital VCRs typically have limited computational 
resources. Restricted authentication relies on the use of a shared secret and hash functions to 
respond to a random challenge. It is based on a device being able to prove that it holds a secret 
shared with other devices. One device authenticates another by issuing a random challenge that is 
responded to by modifying it with the shared secret and multiple hashings. The restricted 
authentication protocol begins when the sink device initiates the authentication protocol by 
sending a request to the source device. The source device then requests the device ID of the sink 
device. After receiving the device ID, the source device generates a random challenge and sends 
it to the sink device. After receiving a random challenge back from the source device, the sink 
device computes a response using its license key (assigned by the DTLA and a function of the 
device ID and service key) and sends it to the source. After the sink device returns a response, 
the source device compares this response with similar information generated at the source side 
using its service key and the ID of the sink device. If the comparison matches its own 
calculation, the sink device has been verified and authenticated. The source and sink devices then 
each calculate an authorization key. 
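
The challenge-response exchange may be sketched in Python as follows. HMAC-SHA256 is used here as a stand-in for the unspecified hash construction, and the key derivation, key values and device ID are hypothetical.

```python
import hashlib
import hmac
import secrets


def license_key(service_key, device_id):
    """Stand-in for the DTLA-assigned key: a keyed hash of the device ID."""
    return hmac.new(service_key, device_id, hashlib.sha256).digest()


def response(key, challenge):
    """Answer a random challenge with a keyed hash of it."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def source_verifies(service_key, sink_id, challenge, sink_response):
    """Source recomputes the expected response from its service key and the sink ID."""
    expected = response(license_key(service_key, sink_id), challenge)
    return hmac.compare_digest(expected, sink_response)


service_key = b"hypothetical-service-key"
sink_id = b"sink-0001"
sink_key = license_key(service_key, sink_id)   # provisioned into the sink in advance
challenge = secrets.token_bytes(16)
ok = source_verifies(service_key, sink_id, challenge, response(sink_key, challenge))
```

The sink never holds the service key itself, only a value derived from it and the sink's own ID, which is why the source can recompute the expected response while a sink claiming a different ID cannot produce it.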

The following steps are common to both full and restricted authentication. The source 
device generates a random number for an exchange key, scrambles it using its calculated 
authorization key, and sends it to the sink device. The sink device then descrambles the exchange 
key using its own calculation of the authorization key. This exchange key can be repeatedly used 
to set up and manage the security of copyrighted content streams without further authentication. 
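
A minimal Python sketch of this step follows. The scrambling function is not specified above, so a bytewise XOR is assumed purely for illustration, as are the 8-byte key lengths.

```python
import secrets


def scramble(key, authorization_key):
    """XOR stand-in for the scrambling step; applying it twice restores the input."""
    return bytes(k ^ a for k, a in zip(key, authorization_key))


authorization_key = secrets.token_bytes(8)   # computed independently by both devices
exchange_key = secrets.token_bytes(8)        # random number chosen by the source
sent = scramble(exchange_key, authorization_key)
recovered = scramble(sent, authorization_key)   # performed at the sink
```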

The cipher used to encrypt the content must be robust enough to protect the content, yet 
efficient enough to implement on a variety of platforms. To ensure interoperability, all 
compliant devices must support the baseline cipher and possibly additional, optional ciphers for 
protecting the content. Ciphers can be used in the converted-cipher block-chaining mode. Cipher 
block-chaining is a technique that adds feedback into the input of the cipher. Converted-cipher 
block-chaining provides greater security than ordinary cipher block-chaining by using secretly 
converted ciphertext (ciphertext is the output of a cipher — plaintext in, ciphertext out) as 
feedback rather than known ciphertext on a public channel. Therefore, known-plaintext attacks 
and key-exhaustive searches become more difficult. 
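
The difference between the two chaining modes lies only in what is fed back into the next block. The following Python sketch uses a deliberately trivial one-byte "cipher" and a purely hypothetical conversion function, since the passage above does not specify how the ciphertext is converted; it is meant to show the data flow and not any actual cipher.

```python
def toy_encrypt(block, key):
    return (block + key) % 256      # toy stand-in for a real block cipher


def toy_decrypt(block, key):
    return (block - key) % 256


def convert(cipher_block, conversion_secret):
    """Secret transformation of the ciphertext used as feedback (assumed form)."""
    return (cipher_block * 7 + conversion_secret) % 256


def encrypt_chain(blocks, key, conversion_secret, iv=0):
    feedback, out = iv, []
    for block in blocks:
        cipher_block = toy_encrypt(block ^ feedback, key)
        out.append(cipher_block)
        feedback = convert(cipher_block, conversion_secret)
    return out


def decrypt_chain(cipher_blocks, key, conversion_secret, iv=0):
    feedback, out = iv, []
    for cipher_block in cipher_blocks:
        out.append(toy_decrypt(cipher_block, key) ^ feedback)
        feedback = convert(cipher_block, conversion_secret)
    return out
```

In ordinary cipher block-chaining the feedback would be the ciphertext block itself, which an eavesdropper can observe; here the feedback depends additionally on a secret the eavesdropper does not hold.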

The M6 cipher is tentatively selected as the baseline cipher while DES, Blowfish, and 
others can be used as optional ciphers. The M6 cipher is a common-key block-cipher algorithm 
based on permutation-substitution. It is a rotation-based algorithm like Hitachi's MULTI2 
encryption algorithm currently used as an encryption standard for a Japanese digital satellite 
broadcasting system. The M6 cipher is simpler than MULTI2 and uses the same type of 
algorithm seen in hash functions MD5 and SHA-1 that have shown their ability to withstand 
cryptographic attack. 

Devices that support full authentication can receive and process SRMs that are created by 
the DTLA and distributed with content. System renewability is used to ensure the long-term 
system integrity by revoking the device IDs of compromised devices. SRMs can be updated 
from other compliant devices that have a newer list, from media with prerecorded content, or via 
compliant devices with external communication capability (i.e., over the Internet, phone lines, 
cable system, or network). There are several components of an SRM. Some of the most 
important are: A monotonically increasing system renewability version number is used to ensure 
that only the newest message is used, and is essentially a counter that increases but never 
decreases. A certificate revocation list (CRL) is used to revoke the certificates of devices whose 
security has been compromised. Some devices may have limited nonvolatile memory available to 
store the CRL and thus may only support a subset of the list. Therefore, the entries in the CRL 
should be ordered according to their perceived threat to content. This will ensure that entries for 
devices that are the greatest threat to content can be stored by compliant devices that support 
certificate revocation, but only have limited storage space for SRMs. A DTLA signature (a value 
calculated using the DTLA private key) of these components, which is used to ensure the 
integrity of the SRM. 

The version number of a new SRM is examined. If the message is newer than the current 
information, the system verifies the integrity of the message. If the message is valid and intact, 
then the system updates its information. The system may revoke a device authorization, based 
on the SRM. First, the set-top box (STB) receives an updated SRM with a particular device ID on 
its CRL. The STB then passes the SRM to the digital TV (DTV) when the next cable movie is 
watched. The DTV passes the SRM on to the DVD player when the next DVD movie is watched. 
Once all devices in the current environment have received the SRM, that device ID is fully 
revoked. 
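
The update procedure may be sketched in Python as follows. A keyed hash stands in for the DTLA public-key signature, and the dictionary layout, key value and device IDs are hypothetical.

```python
import hashlib
import hmac

DTLA_KEY = b"hypothetical-dtla-key"   # stand-in; the real scheme uses a DTLA signature


def sign(version, crl):
    payload = repr((version, crl)).encode()
    return hmac.new(DTLA_KEY, payload, hashlib.sha256).digest()


def update_srm(current, new, capacity):
    """Adopt `new` only if it is newer and intact; keep the highest-threat CRL entries."""
    if new["version"] <= current["version"]:
        return current
    if not hmac.compare_digest(sign(new["version"], new["crl"]), new["signature"]):
        return current
    # CRL entries are assumed ordered by threat, so truncation keeps the worst offenders.
    return {"version": new["version"], "crl": new["crl"][:capacity]}


def is_revoked(srm, device_id):
    return device_id in srm["crl"]


current = {"version": 1, "crl": []}
crl = ["id-A", "id-B", "id-C"]
new = {"version": 2, "crl": crl, "signature": sign(2, crl)}
updated = update_srm(current, new, capacity=2)
```

A device with room for only two entries retains the two highest-threat IDs and silently drops the third, which is the reason for ordering the list by perceived threat.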

The user interface concepts according to the present invention are easily applied to other 
special purpose programmable devices, and also to general-purpose programmable devices 
wherein the programming paradigm is event-driven, as well as other programming systems. It 
should also be noted that it is within the scope of the present invention to provide an improved 
interface and programming environment for all types of programmable devices, and in this 
regard, the present invention incorporates adaptive features that optimize the programming 
environment for both the level of the user and the task to be programmed. 

In optimizing the interface, four elements are particularly important: the input device, the 
display format, the sequence of the programming operation, and the ability of the device to 
properly interpret the input as the desired program sequence. 

The present invention proceeds from an understanding that an absence of user frustration 
with respect to a programmable consumer or industrial device or interface, may be particularly 
important with respect to achieving the maximum potential functionality thereof. The interface 
must be designed to minimize the user's frustration level. This can be accomplished by clearly 
furnishing the possible choices, presenting the data in a logical sequence, and leading the user 
through the steps necessary to program the device. 

When applied to other than audiovisual and/or multimedia applications, the pattern 
recognition function may be used to control the execution of a program or selectively control 
execution of portions of the software. For example, in a programmable temperature controller 
application, a sensor or sensor array could be arranged to detect a "door opening". On the 
occurrence of the door opening, the system would recognize this pattern, i.e. a mass of air at a 
different temperature entering the environment from a single location, or a loss of climate 
controlled air through a single location. In either event, the system would take appropriate 
action, including: halting normal climate control and imposing a delay until the door is closed; 
after closure, setting a time constant for maintenance of a steady state of the replaced air with the 
climate controlled air; based on the actual climatic condition after assimilation, or a predicted 
climatic condition after assimilation, beginning a climate compensation control; and optionally, during 
the door opening, controlling a pressure or flow of air to counterbalance the normal flow through the 
door, by using a fan or other device. The climate may differ in temperature, humidity, pollutants, 
or the like, and appropriate sensors may be employed. 
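
The door-opening example may be sketched in Python as follows. The sensor names, the temperature jump threshold and the settling time are assumptions, and the pattern test (a sudden change at exactly one sensor) is a much simpler rule than a trained pattern recognizer would apply.

```python
def door_opening_detected(previous, current, jump=2.0):
    """True when exactly one sensor shows a sudden temperature change."""
    changed = [name for name in current
               if abs(current[name] - previous[name]) >= jump]
    return len(changed) == 1


def control_action(door_open, seconds_since_close, settle_time=60.0):
    """Choose the controller state following the sequence described above."""
    if door_open:
        return "halt"          # suspend normal climate control while the door is open
    if seconds_since_close < settle_time:
        return "settle"        # let the replaced air mix before measuring
    return "compensate"        # resume control based on the settled condition
```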

The present invention also allows a dynamic user preference profile determination based 
on explicit or implicit desires, e.g., moods, which assist in processing data to make decisions 
which conform to the user preference at a given point in time. For example, voice patterns, skin 
temperature, heart pulse rate, external context, skin resistance (galvanic skin response), blood 
pressure, stress, as determined by EMG, EEG or other known methods, spontaneous motor 
activity or twitching, may be detected in order to determine or infer a user mood, which may be 
used as a dynamic influence on the user preference. These dynamic influences are preferably 
stored separately from static influences of the preferences, so that a resultant determined 
preference includes a dynamic influence based on a determined mood or other temporally 
varying factor and a static influence associated with the user. 
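
Keeping the two influences separate and combining them only when a decision is needed may be sketched as follows. The additive rule, the mood weight and the category names are assumptions made for illustration.

```python
def resultant_preference(static, dynamic, mood_weight=0.5):
    """Combine stored static preferences with a mood-dependent adjustment."""
    categories = set(static) | set(dynamic)
    return {c: static.get(c, 0.0) + mood_weight * dynamic.get(c, 0.0)
            for c in categories}


static_profile = {"drama": 0.6, "comedy": 0.4}
# Hypothetical adjustment inferred from, e.g., an elevated pulse rate.
stressed_mood = {"comedy": 0.8, "drama": -0.4}
now = resultant_preference(static_profile, stressed_mood)
```

The static profile is left unmodified, so a transient mood shifts the current ranking without altering the long-term preferences stored for the user.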

When a group of people are using the system simultaneously, the system must make a 
determination of a composite preference of the group. In this case, the preferences of the 
individuals of the group, if known, may be correlated to produce an acceptable compromise. 
Where individual preferences are not a priori known, individual or group "interviews" may be 
initially conducted to assist in determining the best composite group preference. 
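
One possible way of correlating individual preferences into a compromise is sketched below in Python. It applies a rule in which dislikes count more heavily than likes; the ratings scale, the weight of 2.0 and the program names are assumptions.

```python
def group_score(ratings, dislike_weight=2.0):
    """Sum member ratings in [-1, 1], counting negative ratings more heavily."""
    return sum(r * dislike_weight if r < 0 else r for r in ratings)


def choose_for_group(programs, dislike_weight=2.0):
    """Pick the program with the best composite score."""
    return max(programs, key=lambda p: group_score(programs[p], dislike_weight))


programs = {
    "thriller": [1.0, 1.0, -0.9],     # two fans, one strong objection
    "documentary": [0.4, 0.3, 0.2],   # mildly liked by everyone
}
```

With the heavier dislike weighting, the program that everyone finds tolerable is chosen over the one that two members favor strongly and one member objects to; with equal weighting the choice would reverse.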

It is therefore an object according to the present invention to provide a radio receiver or 
video receiver device, having a plurality of different available program sources, determining a 
program preference for one or more individuals subject to a presented program, comparing the 
determined program preference and a plurality of different program sources, and selecting at least 
one program based on the comparison. 

In formulating a group preference, individual dislikes may be weighted more heavily than 
likes, so that the resulting selection is tolerable by all and preferable to most group members. 
Thus, instead of a best match to a single preference profile for a single user, a group system 
provides a most acceptable match for the group. It is noted that this method is preferably used in 
groups of limited size, where individual preference profiles may be obtained, in circumstances 
where the group will interact with the device a number of times, and where the subject source 
program material is the subject of preferences. Where large groups are present, demographic 
profiles may be employed, rather than individual preferences. Where the device is used a small 
number of times by the group or members thereof, the training time may be very significant and 
weigh against automation of selection. Where the source material has little variety, or is not the 
subject of strong preferences, the predictive power of the device as to a desired selection is 
limited. 

The present invention provides a system and method for making use of the available 
broadcast media forms for improving an efficiency of matching commercial information to the 
desires and interests of a recipient, improving a cost effectiveness for advertisers, improving a 
perceived quality of commercial information received by recipients and increasing profits and 
reducing required information transmittal by publishers and media distribution entities. 

This improved advertising efficiency is accomplished by providing a system for collating 
a constant or underlying published content work with a varying, demographically or otherwise 
optimized commercial information content. This commercial information content therefore need 
not be predetermined or even known to the publisher of the underlying works, and in fact may be 
determined on an individual receiver basis. It is also possible to integrate the demographically 
optimized information within the content. For example, overlays in traditional media, and 
electronic substitutions or edits in new media, may allow seamless integration. The content 
alteration need not be only based on commercial information, and therefore the content may vary 
based on the user or recipient. 

U.S. Patent No. 5,469,206, expressly incorporated herein by reference, relates to a system 
that automatically correlates user preferences with electronic shopping information to create a 
customized database for the user. 

Therefore, the granularity of demographic marketing may be very fine, on a receiver-by- 
receiver basis. Further, the accounting for advertisers will be more accurate, with a large sample 
and high quality information. In fact, in a further embodiment, an interactive medium may be 
used allowing immediate or real time communication between recipient and advertiser. This 
communication may involve the Internet, private networks or dial-up connections. Because the 
commercial messages are particularly directed to recipients, communication with each selected 
recipient is more valuable to an advertiser and that advertiser is willing to pay more for 
communication with each selected recipient. Recipients may therefore be selected to receive the 
highest valued appropriate commercial message(s). Thus, advertisers will tend to pay less and 
media producers will gain more revenues. Recipients will gain the benefit of selected and 
appropriate media, and further, may provide feedback for determining their preferences, which 
will likely correspond with their purchasing habits. Thus, the recipient will benefit by receiving 
optimized information. 

Likewise, a recipient may place a value on receiving certain information, which forms the 
basis for "pay-per-view" systems. In this case, the recipient's values may also be considered in 
defining the programming. 

This optimization is achieved by providing a device local to the recipient which 
selectively presents commercial information to the recipient based on characteristics individual 
to the recipient, which may be input by the recipient, the publisher, the advertiser, and/or learned 
by the system based on explicit or implicit feedback. The local device either has a local memory 
for advertising materials, or a telereception link for receiving commercial information for 
presentation, either on a real time basis or stored for later presentation. In a further embodiment, 
a user may control the content and/or commercial information received. In this case, the 
accounting system involves the user's account, and, for example, the recipient may be denied the 
subsidy from the commercial advertiser, and pay for the privilege of commercial free content. 

It is also possible to employ the methods and systems according to the present invention 
to create a customized publication, which may be delivered physically to the recipient, for 
example as print media, facsimile transmission, e-mail, R-CD-ROM, floppy disk, or the like, 
without having a device local to the consumer. 

It is noted that this system and method is usable for both real time media, such as 
television, radio and on-line telecommunication, as well as manually distributed periodicals, such 
as newspapers, magazines, CD-ROMs, diskettes, etc. Therefore, the system and method 
according to the present invention includes a set of related systems with varying details of 
implementation, with the underlying characteristic of optimization of variable material 
presentation at the recipient level rather than the publisher level. 

The system and method according to the present invention preferably includes an 
accounting system which communicates information relating to receipt of commercial 
advertising information by a recipient to a central system for determination of actual receipt of 
information. This feedback system allows verification of receipt and reduces the possibility of 
fraud or demographic inaccuracies. 

The accounting system, for example, may place value on the timeslot, associated content, 
the demographics of the user, user's associated valuation, competition for placement, past history 
(number of impressions made to same recipient) and exclusivity. 

A preferred embodiment includes a subscription television system having a plurality of 
received channels. At least one of these channels is associated with codes to allow determination 
of content from variable segments. It is also possible to identify these variable segments without 
these codes, although the preferred system includes use of such codes. These codes also allow 
simple identification of the content for accounting purposes. Upon detection of a variable 
segment, a commercial advertisement is selected for presentation to the recipient. This variable 
segment is selected based on the characteristics of the recipient(s), the history of use of the 
device by the recipient(s), the context of use, the arrangements made by the commercial 
information provider(s) for presentation of information, and the availability of information for 
presentation. Other factors may include the above-mentioned accounting system factors. 
Typically, the local device will include a store of commercial information, downloaded or 
otherwise transmitted to the recipient (e.g., a CD-ROM or DVD with MPEG-2 compressed 
images). A telecommunication link may also be provided to control the process, provide 
parameters for the presentation or the information itself. This telecommunication link may be 
provided through the public telephone network, Internet, private network (real or virtual), cable 
network, or a wireless network, for example. Generally, the underlying work will have a gap of 
fixed length, so that the commercial information must be selected to fit in this gap. Where the 
gap is of variable length, such as might occur in live coverage, the commercial information is 
interrupted or the underlying work buffered and delayed to prevent loss. Thus, the presentation 
to the user is constructed from pieces, typically at the time of presentation, and may include 
invariable content, variable content, invariable messages, variable messages, targeted content 
and/or messages, and hypervariable content. Hypervariable content includes, for example, 
transition material selected based on the stream of information present, and other presentations 
which may optionally include useful information individualized for the particular 
recipient or situation. 
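
By way of non-limiting illustration, the selection of stored commercial information for a gap of fixed length might be sketched as follows. The `Ad` record, its fields (`bid`, `target_tags`), the `select_ad` function and the valuation formula are assumptions introduced here for illustration only; the specification does not prescribe any particular data structure or formula.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Ad:
    ad_id: str
    duration_s: float        # running time of the stored commercial
    bid: float               # hypothetical price offered per impression
    target_tags: frozenset   # hypothetical recipient characteristics sought


def select_ad(ads: Sequence[Ad], gap_s: float, recipient_tags: frozenset,
              impressions: dict) -> Optional[Ad]:
    """Return the highest-valued stored ad that fits the gap, or None."""
    best, best_value = None, 0.0
    for ad in ads:
        if ad.duration_s > gap_s:
            continue  # the commercial must fit the fixed-length gap
        overlap = len(ad.target_tags & recipient_tags)
        if overlap == 0:
            continue  # not appropriate for this recipient
        # Assumed valuation: bid scaled by tag overlap, discounted by the
        # number of impressions already made to the same recipient.
        value = ad.bid * overlap / (1 + impressions.get(ad.ad_id, 0))
        if value > best_value:
            best, best_value = ad, value
    return best
```

In this sketch the impression count stands in for the "past history" accounting factor; the other factors named above (timeslot, exclusivity, competition for placement) could enter the same valuation as further multipliers.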

According to another embodiment, a recording, such as on a videotape, is retained by a 
recipient which includes proprietary content. This may include a commercial broadcast, a private 
broadcast, or distributed media. In the case of a commercial broadcast, some or all of the 
commercial advertising or other time-sensitive information is old and/or stale. Therefore, in 
operation, this old or time sensitive information is eliminated and substituted with new and/or 
different information. Thus, the presentation system freshens the presentation, editing and 
substituting where necessary. 

By such a method, content distributed even through private channels may include 
advertisements, and thus be subsidized by advertisers. The advertisements and other added 
content are generally more acceptable to the audience because they are appropriately targeted. 

For example, where the broadcaster has a high degree of control over the initial 
broadcast, e.g., pay per view under license, or where the broadcaster may claim substantial 
continuing rights in the work after recording, the enforcement of a proprietary replay system may 
be accepted. For example, a work is broadcast as an encrypted digital data stream, with selective 
decryption at the recipient's receiver, under license from the broadcaster. In this case, a 
recording system is provided which retains the encryption characteristics, ensuring the integrity 
of the accounting process. During presentation of the recorded work, commercial information is 
appropriately presented to the recipient during existing or created gaps, or in an associated output 
separate from the content presentation. The recipient, as a result, receives the benefit of the 
original subsidy, or may receive a new subsidy. 

Therefore, similar to the known DIVX system, an encrypted medium may be mass 
distributed, which requires authorization for display. Instead, however, of requiring the recipient 
to pay for the initial and subsequent displays of the content, the player integrates advertising 
content into the output, which may vary based on the audience, time and past history, as well as 
other factors discussed herein. Given the interactive and variable nature of the presentation, the 
user or audience may even veto ("fast forward through") a particular commercial. In this case, 
the user may have to account for a fee, or other advertisers may take up the slack. The veto 
provides information regarding the desires of the viewer, and may be used to help select future 
messages to be displayed or presented. 

According to another embodiment, a radio transmission/reception system is provided 
which broadcasts content, an overlay track and variable commercial information. The invariant 
works are preferably prerecorded music. The overlay track is preferably a "DJ", who provides 
information regarding the invariant works, commercial information or news. The commercial 
information in this instance therefore refers to prerecorded segments. In this instance, the goal is 
to allow the invariant works to be received by the recipient and presented with improved 
optimization of the commercial information content and other messages presented at the time of 
output. Further, this system allows optimization of the presentation of the invariant portions as 
well, i.e., the commercial information and the program content may be independently selected at 
the receiver, with appropriate accounting for commercial subsidy. In a mobile receiver, it is 
preferable to include as a factor in the selection of commercial information a location of the 
receiver, as might be obtained from a GPS system, cellular location system, intelligent highway 
system or the like. This would allow geographically appropriate selection of commercial 
information, and possibly overlay information as well, e.g., traffic reports. 
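
As a hedged illustration of location as a selection factor in a mobile receiver, the following sketch filters candidate messages by great-circle distance from the receiver. The message fields (`id`, `lat`, `lon`, `radius_km`) and the function names are hypothetical; the position is assumed to be already available from a GPS or comparable location system.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def local_messages(messages, receiver_lat, receiver_lon):
    """Keep only messages whose service area covers the receiver position."""
    return [
        m["id"] for m in messages
        if distance_km(receiver_lat, receiver_lon, m["lat"], m["lon"])
        <= m["radius_km"]
    ]
```

The surviving messages would then be ranked by the other selection factors; the same filter could be applied to overlay information such as traffic reports.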

Another embodiment according to the present invention provides a hypertext linked 
media or multimedia environment, such as HTML/World Wide Web, wherein information 
transmitted and/or displayed is adaptively selected based on the particular user or the user's 
receiving system. Thus, various elements may be dynamically substituted during use. 

Therefore, it is an object according to the present invention to provide adaptive 
man-machine interfaces, especially computer graphic user interfaces, which are economically 
improved to provide an optimized environment. Productivity of computer operators is limited by 
the time necessary to communicate a desired action through the user interface to the device. To 
reduce this limitation, most likely user actions are predicted and presented as easily available 
options. The technologies also extend beyond this core theme in many differing ways, 
depending on the particular application. 

The system also provides an intelligent, adaptive pattern recognition function in order to 
provide the operator with a small number of high probability choices, which may be complex, 
without the need for explicit definition of each atomic instruction comprising the desired action. 
The interface system predicts a desired action based on the user input, a past history of use, and a 
context of use. 
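
A minimal sketch of such prediction, assuming a simple frequency count stands in for the adaptive pattern recognition function, is given below. The class `ActionPredictor`, its methods and the string labels for contexts and actions are hypothetical names chosen for illustration; a practical system could use any learning method.

```python
from collections import Counter, defaultdict


class ActionPredictor:
    """Ranks likely next actions from counts of past actions per context."""

    def __init__(self):
        self._by_context = defaultdict(Counter)  # context -> action counts
        self._overall = Counter()                # fallback across all contexts

    def record(self, context, action):
        """Store one observed (context, action) pair in the usage history."""
        self._by_context[context][action] += 1
        self._overall[action] += 1

    def suggest(self, context, k=3):
        """Return up to k actions, most frequent in this context first."""
        seen = self._by_context.get(context, Counter())
        ranked = [action for action, _ in seen.most_common(k)]
        # Fill any remaining slots from the history across all contexts.
        for action, _ in self._overall.most_common():
            if len(ranked) >= k:
                break
            if action not in ranked:
                ranked.append(action)
        return ranked
```

The short ranked list returned by `suggest` corresponds to the small number of high probability choices presented to the operator; each choice may itself expand to a sequence of atomic instructions.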

In yet another embodiment, a present mood of a user is determined, either explicitly or 
implicitly, and the device selects program material that assists in a desired mood transition. The 
operation of the device may additionally acquire data relating to an individual and the respective 
moods, desires and characteristics, altering the path provided to alter the mood based on the data 
relating to the individual. As stated above, in a group setting, a most acceptable path is presented 
rather than a most desirable path as presented for an individual. 



In determining mood, a number of physiologic parameters may be detected. In a training 
circumstance, this set of parameters is correlated with a temporally associated preference. 
Thus, when a user inputs a preference into the system as feedback, mood data is also obtained. 
Invariant preferences may be separated, and analyzed globally, without regard for temporal 
variations, while varying preferences are linked with information regarding the surrounding 
circumstances and stored. For example, the preference data may be used to train a neural 
network, e.g., using backpropagation of errors or other known methods. The inputs to the neural 
network include available data about surrounding context, such as time, environmental 
brightness, and persons present; source program choices, which may be raw data, preprocessed 
data, and abstracted data; explicit user input; and, in this embodiment, mood parameters, which 
may be physiological or biometric data, voice pattern, or implicit inputs. An example of an 
implicit input is an observation of a man-machine interaction, such as a video game. The manner 
in which a person plays a video game or otherwise interacts with a machine may provide 
valuable data for determining a mood or preference. 

According to one embodiment of the invention, the image is preprocessed to decompose 
the image into object-elements, with various object-elements undergoing separate further 
processing. For example, certain backgrounds may be aesthetically modeled using simple fractal 
equations. While, in such circumstances the results may be inaccurate in an absolute sense, they 
may be adequate in a performance sense. Faces, on the other hand, have common and variable 
elements. Therefore, a facial model may be based on parameters having distinguishing power, 
such as width between eyes, mouth, shape of ears, and other proportions and dimensions. Thus, 
along with color and other data, a facial image may be stored as a reference to a facial model 
with the distinguishing parameters for reconstruction. Such a data processing scheme may 
produce a superior reconstructed image and allow for later recognition of the face, based on the 
stored parameters in reference to the model. Likewise, many different elements of an image may 
be extracted and processed in accordance with specific models to produce differentiating 
parameters, wherein the data is stored as a reference to the particular model along with the 
particular data set derived from the image. Such a processing scheme allows efficient image 
storage along with ease of object recognition, i.e., distinction between objects of the same class. 
This preprocessing provides a highly asymmetric scheme, with a far greater processing 
complexity to initially process the image than to subsequently reconstruct or otherwise later 
employ the data. 

By employing a model-based object extraction system, the available bandwidth may be 
efficiently used, so that objects which fall within the scope of an available model may be 
identified with a model identification and a series of parameters, and objects not within the scope 
of a model may be allocated a comparatively greater bandwidth for general image description, 
e.g., JPEG, MPEG-1/MPEG-2, wavelet, standard fractal image compression (FIC), or other 
image processing schemes. In a worst case, therefore, the bandwidth required will be only 
slightly greater than that required for a corresponding standard method, due only to the additional 
overhead to define data types, as necessary. However, by employing a model-based object 
decomposition processing system, recognized elements may be described using only a small 
amount of data and a greater proportion of data used to describe unrecognized elements. Further, 
the models available may be dynamically updated, so that, as between a communicating 
transmitter and receiver, retransmission of unrecognized elements will be eliminated as a model 
is constructed. 
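
The following toy sketch illustrates the idea of describing a recognized element by a model identification plus parameters, with a fallback to a general description. The single "constant run" model, the record layout and the function names are all assumptions made for illustration; real models (e.g., for faces or backgrounds) would be far richer.

```python
def fit_constant(pixels):
    """Hypothetical model: a run of identical values, as (value, length)."""
    if pixels and all(p == pixels[0] for p in pixels):
        return (pixels[0], len(pixels))
    return None  # element is outside the scope of this model


MODELS = {"constant": fit_constant}  # model id -> fitting function


def encode(pixels, models=MODELS):
    """Return a compact model reference when a model fits, else raw data."""
    for model_id, fit in models.items():
        params = fit(pixels)
        if params is not None:
            return ("model", model_id, params)
    return ("raw", list(pixels))


def decode(record):
    """Rebuild the pixel run from either record type."""
    if record[0] == "raw":
        return list(record[1])
    _, model_id, params = record
    if model_id == "constant":
        value, length = params
        return [value] * length
    raise ValueError(f"unknown model: {model_id}")
```

The leading tag on each record is the "overhead to define data types" noted above: in the worst case every element is carried as raw data plus that tag.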

Where image processing systems may produce artifacts and errors, an error minimization 
function may also be provided which compares an original image with a decomposed-recomposed 
image and produces an error function which allows correction for these errors. This 
error function may be transmitted with the processed data to allow more faithful reproduction. In 
a pattern recognition context, the error function may provide useful data relating to the reliability 
of a pattern correlation, or may provide useful data outside of the model and associated 
parameters for pattern recognition. 
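
One simple reading of the error function is a per-pixel residual between the original and the decomposed-recomposed image, as sketched below. This is an assumption for illustration: images are taken to be equal-sized lists of rows of numbers, and the function names are hypothetical.

```python
def error_function(original, recomposed):
    """Per-pixel residual: original minus decomposed-recomposed image."""
    return [
        [o - r for o, r in zip(o_row, r_row)]
        for o_row, r_row in zip(original, recomposed)
    ]


def correct(recomposed, error):
    """Apply a transmitted residual to recover the original image."""
    return [
        [r + e for r, e in zip(r_row, e_row)]
        for r_row, e_row in zip(recomposed, error)
    ]


def mean_abs_error(error):
    """Scalar summary usable as a reliability figure for a pattern match."""
    values = [abs(e) for row in error for e in row]
    return sum(values) / len(values) if values else 0.0
```

A receiver that needs faithful reproduction applies `correct`; a recognizer may instead discard the residual and consult only its magnitude.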

Thus, in the case of an object-extraction model-based processing system, the resulting 
data stream may be appropriate for both viewing and recognition. Of course, acoustic data may 
be likewise processed using acoustic models with variable parameters. However, in such a 
system, information for pattern recognition may be filtered, such as eliminating the error function 
or noise data. Further, certain types of objects may be ignored, for example, under normal 
circumstances, clouds in the sky provide little information for pattern recognition and may be 
removed. In such a system, data intended for viewing or listening will likely contain all objects 
in the original data stream, with as much original detail as possible given data storage and 
bandwidth constraints. 

An object extraction model based processing system also allows for increased noise 
rejection, such as over terrestrial broadcast channels. By transmitting a model, the receiving 
system may interpolate or extrapolate data to fill in for missing data. By extrapolate, it is meant 
that past data is processed to predict a subsequent condition. By interpolate, it is meant that data 
presentation is delayed, and missing data may therefore be predicted from both past and 
subsequent data transmission. Missing portions of images may also be reconstructed from 
existing portions. This reconstruction process is similar to that described in U.S. Pat. No. 
5,247,363, to reconstruct MPEG images; except that where model data is corrupted, the 
corruption must be identified and the corrupt data eliminated and replaced with predicted data. 
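
The distinction drawn above between extrapolation and interpolation can be sketched for a one-dimensional sequence of model parameters as follows. Linear prediction, the use of `None` to mark a missing sample, and the function names are assumptions for illustration; the first sample is assumed to be present.

```python
def extrapolate(past):
    """Predict the next value from past data only (linear trend)."""
    if not past:
        raise ValueError("extrapolation needs at least one past sample")
    if len(past) == 1:
        return past[-1]
    return past[-1] + (past[-1] - past[-2])


def interpolate(before, after):
    """Predict a missing value from both past and subsequent data."""
    return (before + after) / 2


def fill_missing(samples, delayed):
    """Replace None entries; `delayed` means later samples may be consulted."""
    out = list(samples)
    for i, value in enumerate(out):
        if value is not None:
            continue
        nxt = samples[i + 1] if i + 1 < len(samples) else None
        if delayed and i >= 1 and nxt is not None:
            out[i] = interpolate(out[i - 1], nxt)
        else:
            out[i] = extrapolate(out[:i])
    return out
```

With presentation delayed, the gap in `[1.0, 2.0, None, 6.0]` is filled from both neighbours; without delay, only the two earlier samples are available and the prediction differs.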

It is therefore an object according to the present invention to provide a programmable 
control, having a status, responsive to a user input and a signal received from a signal source, 
comprising a controller, for receiving the user input and the signal and producing a control 
output; a memory for storing data relating to an activity of the user; a data processing system for 
adaptively predicting a most probable intended action of the user based on the stored data 
relating to the activity of the user and derived weighing of at least a subset of possible choices, 
the derivation being based on a history of use, a context of a respective choice and the status of 
the control; and a user feedback data presenting system comprising an output device for 
presentation of a variable sequence of programming options to the user, including the most 
probable intended action of the user, in a plurality of output messages, the output messages 
differing in available programming options. 

The programmable control may be employed for performing an action based on user 
input and an information content of a signal received from a signal source, wherein the output 
device includes a display device, further comprising a user controlled direct manipulation-type 
input device, associated with the display device, having a device output, the device output being 
the user input; a plant capable of performing the action, being responsive to an actuator signal; 
and the controller, being for receiving data from the device output of the input device and the 
signal, and displaying user feedback data on the display device, the logical sequence of the user 
feedback data including at least one sequence of options sufficient to define an operable control 
program, and a presentation of additional programming options if the control program is not 
operable. 

The programmable control may further comprise a user input processing system for 
adaptively determining a viewer preference based on the user input received by the controller; a 
program material processing system for characterizing the program material based on its content; 
a correlator for correlating the characterized content of the program material with the determined 
viewer preference to produce a correlation index; and a processor, selectively processing the 
program material based on the correlation index, the data processing system receiving an input 
from the processor. 
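
As one possible illustration of the correlator, the sketch below computes a correlation index as the cosine similarity between a content characterization and a viewer preference, each held as a mapping from feature name to weight. The representation, the threshold value and the function names are assumptions; the specification does not fix a particular correlation measure.

```python
import math


def correlation_index(content, preference):
    """Cosine similarity between two sparse feature-weight mappings."""
    shared = set(content) & set(preference)
    dot = sum(content[k] * preference[k] for k in shared)
    norm_c = math.sqrt(sum(v * v for v in content.values()))
    norm_p = math.sqrt(sum(v * v for v in preference.values()))
    if norm_c == 0 or norm_p == 0:
        return 0.0  # nothing known about content or viewer
    return dot / (norm_c * norm_p)


def should_process(content, preference, threshold=0.5):
    """Selectively process (e.g., record) material above the threshold."""
    return correlation_index(content, preference) >= threshold
```

Here `should_process` plays the role of the processor that selectively processes program material based on the correlation index.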

It is noted that a metadata stream associated with the content may be employed to 
characterize the content, relieving the receiver or client device from the need for characterizing 
the content. This metadata may be structured or unstructured. The metadata and data relating to 
the use or consumption of the content is then used to determine or update the user profile. It is 
noted that the content may be of any type, and therefore need not be video or multimedia. In the 
case of a structured metadata, the updating of the user profile may include a simple time- 
weighted decay (e.g., a simple infinite impulse response filter with exponential decay or diurnal 
variations, or other type) for correlation with future metadata records, or a more complex 
algorithm. 
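
The simple time-weighted decay mentioned above can be sketched as a first-order infinite impulse response filter, p[t] = (1 - alpha) * p[t-1] + alpha * x[t], applied per metadata tag. The tag-set representation of a metadata record, the smoothing factor `alpha` and the function name are assumptions for illustration.

```python
def update_profile(profile, metadata_tags, alpha=0.2):
    """Blend one consumed item's tags into the profile with exponential decay.

    profile: mapping of tag -> weight in [0, 1]
    metadata_tags: set of tags from the structured metadata record
    alpha: assumed smoothing factor; larger values forget old habits faster
    """
    tags = set(profile) | set(metadata_tags)
    return {
        tag: (1 - alpha) * profile.get(tag, 0.0)
        + alpha * (1.0 if tag in metadata_tags else 0.0)
        for tag in tags
    }
```

Each call decays every existing weight by the factor (1 - alpha), so tags that stop appearing fade exponentially; the resulting weights can be correlated with future metadata records.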

The programmable control may also comprise a plurality of stored profiles, a processor 
for characterizing the user input to produce a characterized user input; and means for comparing 
the characterized user input with at least one of the plurality of stored profiles to produce a 
comparison index, wherein the variable sequence of programming options is determined on the 
basis of the comparison index. The processor for characterizing may perform an algorithm on 
the signal comprising a transform selected from the group consisting of an affine transformation, 
a Fourier transformation, a discrete cosine transformation and a wavelet transformation. 

It is a further object according to the present invention to provide a programmable 
controller for controlling a recording device for recording an analog signal sequentially on a 
recording medium having a plurality of uniquely identifiable storage locations, further 
comprising a sequential recording device for recording the analog signal, and a memory for 
storing, in a directory location on the recording medium which is separate from the storage 
location of the analog signal, information relating to the signal, processed to selectively retain 
characterizing information, and an identifier of a storage location on the recording medium in 
which the analog signal is recorded. 

It is another object according to the present invention to provide a control, wherein 
program material is encrypted, further comprising a decryption system for decrypting the 
program material if it is selected to produce unencrypted program material and optionally an 
associated decryption event; a memory for storing data relating to the occurrence of the 
decryption event; and a central database for storing data relating to the occurrence of the 
decryption event in association with data relating to the viewer. 

It is still another object according to the present invention to provide a control wherein 
the user input processing system monitors a pattern of user activity and predicts a viewer 
preference; the program material processing system comprising a processor for preprocessing the 
program material to produce a reduced data flow information signal substantially retaining 
information relating to the abstract information content of the program material and selectively 
eliminating data not relating to the abstract information content of the program material and for 
characterizing the information signal based on the abstract information content; and a comparing 
system for determining if the correlation index is indicative of a probable high correlation 
between the characterization of the information signal and the viewer preference and causing the 
stored program material to be processed by the processing means based on the determination. 
The system according to this aspect of the present invention preferably comprises an image 
program material storage and retrieval system. 

The present invention further provides a control further comprising a memory for storing 
a characterization of the program material; an input for receiving a feedback signal from the 
viewer indicating a degree of agreement with the correlation index determination, wherein the 
feedback signal and the stored characterization are used by the viewer preference predicting 
means to predict a new viewer preference. 

According to another aspect of the invention, it is an object to provide an image 
information retrieval apparatus, comprising a memory for storing compressed data representing a 
plurality of images; a data storage system for retrieving compressed data representing at least one 
of the plurality of images and having an output; a memory for storing characterization data 
representing a plurality of image types, having an output; and an image processor, receiving as 
inputs the outputs from the data storage system and the characterization data memory, and 
producing a signal corresponding to a relation between at least one of the plurality of images of 
the compressed data and at least one of the image types of the characterization data. 

It is a still further aspect of the present invention to provide a video interface device for a 
user comprising a data transmission system for simultaneously transmitting data representing a 
plurality of programs; a selector for selecting at least one of the plurality of programs, being 
responsive to an input; a program database containing information relating to the plurality of 
programs, having an output; a graphical user interface for defining commands, comprising (a) an 
image display device having at least two dimensions of display, being for providing visual image 
feedback; and (b) a multidimensional input device having at least two dimensions of operability, 
adapted to correspond to the two dimensions of the display device, and having an output, so that 
the user may cause the input device to produce a corresponding change in an image of the display 
device by translating an indicator segment of the display in the at least two dimensions of 
display, based on the visual feedback received from the display device, the indicator segment 
being moved to a translated location of the display device corresponding to a user command; and 
a controller for controlling the graphical user interface and for producing the input of the 
selector, receiving as a control the output of the multidimensional input device, the controller 
receiving the output of the program database and presenting information relating to at least one 
of the plurality of programs on the display device associated with a command, the command 
being interpreted by the control means as the user command to produce the input of the selector 
to select the at least one of the plurality of programs associated with the command. 

Another object of the present invention is to provide an apparatus, receiving as an input 
from a human user having a user characteristic, comprising an input device, producing an input 
signal from the human user input; a display for displaying information relating to the input from 
the user and feedback on a current state of the apparatus, having an alterable image type; an input 
processor for extracting an input instruction relating to a desired change in a state of the 
apparatus from the input signal; a detector for detecting one or more temporal-spatial user 
characteristics of the input signal, independent of the input instruction, selected from the group 
consisting of a velocity component, an efficiency of input, an accuracy of input, an interruption 
of input and a high frequency component of input; a memory for storing data related to the user 
characteristics; and a controller for altering the image type based on the user characteristics. The 
controller may alter the image type based on an output of the detector and the stored data so that 
the display displays an image type that corresponds to the detected user characteristics. The 
controller may further be for controlling the causation of an action on the occurrence of an event, 
further comprising a control for receiving the input instruction and storing a program instruction 
associated with the input instruction, the control having a memory sufficient for storing program 
instructions to perform an action on the occurrence of an event; and a monitor for monitoring an 
environment of the apparatus to determine the occurrence of the event, and causing the 
performance of the action on the occurrence of the event. The controller may also alter the 
image type based on an output of the detector and the stored data so that the display means 
displays an image type which corresponds to the detected user characteristics. 

It is another object of the present invention to provide an adaptive programmable 
apparatus having a plurality of states, being programmable by a programmer and operating in an 
environment in which a plurality of possible events occur, each of the events being associated 
with different data, comprising a data input for receiving data; a programmer input, producing 
an input signal from the programmer; a memory for storing data relating to the data input or the 
input signal; a feedback device for adaptively providing information relating to the input signal 
and a current status of the apparatus to the programmer, based on the data input or the 
programmer input, the stored data, and derived weighing of at least a subset of possible choices, 
the derived weighing being based on a history of use, a context of a respective choice and the 
current status of the apparatus; a memory for storing programming data associated with the input 
signal; and a processor, having a control output, for controlling the response of the apparatus 
relating to the detection of the input signal or the data in accordance with the stored 
programming data, the processor: (a) processing the at least one of the input signal or the data to 
reduce an amount of information while substantially retaining an abstract portion of the 
information; (b) storing a quantity of the abstracted information; (c) processing the abstract 
portion of the information in conjunction with the stored quantity of abstracted information; and 
(d) providing the control output based on the processed abstract portion of the information and 
the stored programming data. The apparatus may further comprise an input for receiving a 
programming preference from the programmer indicating a plurality of possible desired events; 
the processor further including a correlator for correlating the programming preference with the 
data based on an adaptive algorithm and for determining a likelihood of occurrence of at least 
one of the desired events, producing the control output. The apparatus may further comprise an 
input for receiving feedback from the programmer indicating a concurrence with the control 
output of the processor, and modifying the response control based on the received feedback to 
increase a likelihood of concurrence. The apparatus may still further verify the programming 
data to ensure that the programming data comprise a complete and consistent set of instructions; 
and include a feedback system for interactively modifying the programming data. The apparatus 
may also comprise a chronological database and an accessing system for accessing the 
chronological database on the basis of the programming data stored in the memory. 

It is also an object according to the present invention to provide an apparatus comprising 
an input for receiving a programming preference from the programmer indicating a plurality of 
possible desired events; and a correlator for correlating the programming preference with the data 
based on an adaptive algorithm and for determining a likelihood of occurrence of at least one of 
the desired events, producing the output, the output being associated with the initiation of the 
response. 

The present invention also provides as an object an apparatus comprising an input for 
receiving feedback from the programmer indicating a concurrence with the output of the 
correlator, and modifying the algorithm based on the received feedback, wherein the feedback 
device comprises a display, and the input device is remote from the display and provides a direct 
manipulation of display information of the display. 

According to an aspect of the present invention, a processor of the programmable 
apparatus verifies the program instructions to ensure that the program instructions are valid and 
executable by the processor; an output for providing an option, selectable by the programmer 
input for changing an instruction stored by the processor, such that the apparatus enters a state 
wherein a new instruction may be input to substitute for the instruction, wherein the processor 
verifies the instructions such that the instructions are valid; and wherein the feedback device 
further presents information requesting confirmation from the programmer of the instructions 
associated with the input signal. The apparatus may further comprise a chronological database 
and an accessing system for accessing the chronological database on the basis of the program 
instructions stored in the memory. 

The processor of the programmable apparatus may receive information from the input 
signal and/or from the data input; and may further comprise an input signal memory for storing 
at least a portion of the input signal or the data, a profile generator for selectively generating a 
profile of the input signal or the data, and an input signal profile memory for storing the profile 
of the input signal or the data separately from the input signal or the data in the input signal 
memory. The programmable apparatus may further comprise a processor for comparing the 
input signal or the data with the stored profile of the input signal or the data to determine the 
occurrence of an event, and the data optionally comprises image data and the processor for 
comparing performs image analysis. The image data may comprise data having three associated 
dimensions obtained by a method selected from the group consisting of synthesizing a three 
dimensional representation based on a machine based model derived from two dimensional 
image data, synthesizing a three dimensional representation derived from a time series of pixel 
images, and synthesizing a three dimensional representation based on image data representing a 
plurality of parallax views each having at least two dimensions. 

A user feedback data presenting device according to the present invention may comprise 
a display having a plurality of display images, the display images differing in available 
programming options. 

According to another aspect of the present invention, a program material processing 
system is provided comprising means for storing template data; means for storing the image data; 
means for generating a plurality of domains from the stored image data, each of the domains 
representing different portions of the image information; means for creating, from the stored 
image data, a plurality of addressable mapped ranges corresponding to different subsets of the 
stored image data, the creating means including means for executing, for each of the mapped 
ranges, a procedure upon the one of the subsets of the stored image data which corresponds to the 
mapped range; means for assigning identifiers to corresponding ones of the mapped ranges, each 
of the identifiers specifying for the corresponding mapped range an address of the corresponding 
subset of stored image data; means for selecting, for each of the domains, the one of the mapped 
ranges which most closely corresponds according to predetermined criteria; means for 
representing at least a portion of the image information as a set of the identifiers of the selected 
mapped ranges; and means for selecting, from the stored templates, a template which most 
closely corresponds to the set of identifiers representing the image information. The means for 
selecting may comprise means for selecting, for each domain, the mapped range which is the 
most similar, by a method selected from at least one of the group consisting of selecting a 
minimum Hausdorff distance from the domain, selecting the highest cross-correlation with the 
domain and selecting the lowest mean square error of the difference between the mapped range 
and the domain. The means for selecting may also comprise, for each domain, the mapped range 
with the minimum modified Hausdorff distance calculated as D[db,mrb] + D[1 - db,1 - mrb], 
where D is a distance calculated between a pair of sets of data each representative of an image, 
db is a domain, mrb is a mapped range, 1 - db is the inverse of a domain, and 1 - mrb is an inverse 
of a mapped range. The means for representing may further comprise means for determining a 
feature of interest of the image data, selecting a mapped range corresponding to the feature of 
interest, storing the identifiers of the selected mapped range, selecting a further mapped range 
corresponding to a portion of image data having a predetermined relationship to the feature of 
interest and storing the identifiers of the further mapped range. 
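
The modified Hausdorff criterion described above can be sketched as follows. This is a minimal illustration only, assuming binary images held as equal-sized lists of 0/1 rows, the inverse 1 - x taken pixel by pixel, and D taken as the classical symmetric Hausdorff distance between the sets of "on" pixel coordinates. The function names are hypothetical and do not come from the specification.

```python
import math

def on_pixels(img):
    """Return the set of (row, col) coordinates whose pixel value is 1."""
    return {(r, c) for r, row in enumerate(img) for c, v in enumerate(row) if v}

def invert(img):
    """Pixel-wise inverse of a binary image, i.e. 1 - x for every pixel."""
    return [[1 - v for v in row] for row in img]

def directed(a, b):
    """Largest distance from any point of a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(img_a, img_b):
    """Symmetric Hausdorff distance D between two binary images."""
    a, b = on_pixels(img_a), on_pixels(img_b)
    if not a or not b:
        # Assumption: two empty sets are at distance 0; one empty set is infinitely far.
        return 0.0 if a == b else math.inf
    return max(directed(a, b), directed(b, a))

def modified_hausdorff(db, mrb):
    """D[db, mrb] + D[1 - db, 1 - mrb] for a domain db and a mapped range mrb."""
    return hausdorff(db, mrb) + hausdorff(invert(db), invert(mrb))

def select_range(domain, mapped_ranges):
    """Index of the mapped range with the minimum modified Hausdorff distance."""
    return min(range(len(mapped_ranges)),
               key=lambda i: modified_hausdorff(domain, mapped_ranges[i]))
```

Adding the distance between the inverted images penalizes mismatches in the background as well as in the foreground, so two blocks match closely only when both their "on" and "off" regions agree.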

According to an embodiment of the present invention, the image data comprises data 
having three associated dimensions obtained by a method selected from the group consisting of 
synthesizing a three dimensional representation based on a machine based prediction derived 
from two dimensional image data, synthesizing a three dimensional representation derived from 
a time series of pixel images, and synthesizing a three dimensional representation based on 
image data representing a plurality of parallax views having at least two dimensions. 

It is therefore an object of the present invention to provide a programmable apparatus for 
receiving instructions from a programmer and causing an action to occur on the happening of an 
event, comprising an input device, producing an input instruction signal; a control means for 
receiving the input instruction signal, and storing a program instruction associated with the input 
instruction signal, the control means storing sufficient program instructions to perform an action 
on the occurrence of an event, the control means monitoring a status of the apparatus to 
determine the occurrence of various events, comparing the determined events with the program 
instructions, and performing the action on the occurrence of the event; a display means for 
interactively displaying information related to the instructions to be received, and responsive 
thereto, controlled by the control means, so that the programmer is presented with feedback on a 
current state of the apparatus and the program instruction; wherein the control means further 
comprises means for detecting one or more characteristics of the input instruction signal 
independent of the program instruction selected from the group consisting of a velocity 
component, an efficiency of input, an accuracy of input, an interruption of input, a high 
frequency component of input and a past history of input by the programmer, whereby when the 
control means detects a characteristic indicating that the display means is displaying information 
in a suboptimal fashion, the control means controls the display means to display information in a 
more optimal fashion. 

It is also an object of the present invention to provide a programmable apparatus for 
receiving instructions from a programmer and causing an action to occur on the happening of an 
event, comprising an input device, producing an input instruction signal; a control means for 
receiving the input instruction signal, and storing a program instruction associated with the input 
instruction signal, the control means storing sufficient program instructions to perform an action 
on the occurrence of an event, the control means monitoring a status of the apparatus to 
determine the occurrence of various events, comparing the determined events with the program 
instructions, and performing the action on the occurrence of the event; a display means for 
interactively displaying information related to the instructions to be received, and responsive 
thereto, controlled by the control means, so that the programmer is presented with feedback on a 
current state of the apparatus and the program instruction; wherein the control means further 
comprises means for detecting a need by the programmer for more detailed information 
displayed on the display means, by detecting one or more characteristics of the input instruction 
signal independent of the program instruction selected from the group consisting of a velocity 
component, an efficiency of input, an accuracy of input, an interruption of input, a high 
frequency component of input and a past history of input by the programmer, whereby when the 
control means detects a characteristic indicating that the display means is displaying 
insufficiently detailed information, the control means controls the display means to display more 
detailed information. 

It is a further object of the present invention to provide a programmable apparatus having 
a data input, the apparatus receiving instructions from a programmer and causing an action to 
occur on the receipt of data indicating an event, comprising an input device, producing an input 
instruction signal; a control means for receiving the input instruction signal, and storing a 
program instruction associated with the input instruction signal, the control means storing 
sufficient program instructions to perform an action on the receipt of data indicating an event, the 
control means monitoring the data input; a display means for interactively displaying information 
related to the instructions to be received, and responsive thereto, controlled by the control means, 
so that the programmer is presented with feedback on a current state of the apparatus and the 
program instruction; wherein the control means receives a programming preference indicating a 
desired event from the input device which does not unambiguously define the event, and the 
control means monitors the data and causes the occurrence of the action when a correlation 
between the programming preference and the monitored data is above a predetermined threshold, 
indicating a likely occurrence of the desired event. It is also an object of the present invention to 
provide the aforementioned programmable apparatus, wherein the input device is remote from 
the display means, and provides a direct manipulation of display information of the display 
means, further comprising means for verifying the program instructions so that the program 
instructions are executable by the control means. The control means may further comprise a 
calendar or other chronological database. 

Another object of the present invention provides a programmable information storage 
apparatus having a data input, for receiving data to be stored, the apparatus receiving instructions 
from a programmer and causing an action to occur on the receipt of data indicating an event, 
comprising means for storing data from the data input; an input device, producing an input 
instruction signal; a control means for receiving the input instruction signal, and storing a 
program instruction associated with the input instruction signal, the control means storing 
sufficient program instructions to perform an action on the receipt of data from the data input 
indicating an event, the control means monitoring the data input to determine the occurrence of 
various events, comparing the determined events with the program instructions, and performing 
for storing the data the action on the occurrence of the event; wherein the control means receives 
identifying data from at least one of the input device and the data input, the identifying data 
being stored separately from the input data on a storage medium. The programmable 
information storage apparatus may also include means for reading the identifying data stored 
separately on the storage medium, and may also receive as an input the identifying data. 

It is also an object of the present invention to provide a programmable apparatus, wherein 
the control means provides an option, selectable by the input means in conjunction with the 
display means, for changing an input program instruction prior to execution by the control 
means, so that the apparatus enters a state wherein a new program instruction may be input to 
substitute for the changed input step, wherein the control means verifies the program instructions 
so that the program instructions are executable by the control means. 

It is still another object of the present invention to provide a programmable apparatus, 
wherein the control means further causes the display means to display a confirmation screen after 
the program instructions are input, so that the programmer may confirm the program instructions. 

Another object of the present invention is to provide a programmable information storage 
apparatus, wherein the control means further comprises means for recognizing character data 
present in a data stream of the input data, the identifying data comprising the recognized 
character data. 

It is a still further object of the present invention to provide a video tape recording 
apparatus, comprising a video signal receiving device, a recording device for recording the video 
signal, wherein the control analyzes the video signal for the presence of a symbol, and recognizes 
the symbol as one of a group of recognized symbols, and the control stores the recognized 
symbol separately from the video signal. 

Another object of the present invention is to provide a recording device for recording an 
analog signal sequentially on a recording medium, comprising means for characterizing the 
analog signal, wherein data representing the characterization and a location of the analog signal 
on the recording medium are stored in a directory location on the recording medium separately 
from the analog signal. 

It is a further object of the present invention to provide an interface for a programmable 
control for input of a program for a controller to execute, which performs an action based on an 
external signal, comprising an input device, a controller for receiving data from the input device 
and from an external stimulus, a plant being controlled by the controller based on an input from 
the input device and the external stimulus, and a display device being controlled by the 
controller, for providing visual feedback to a user operating the input device, wherein a 
predetermined logical sequence of programming options is presented to the user on the display 
device, in a plurality of display screens, each of the display screens differing in available 
programming choices; the logical sequence including a correct sequence of choices to set an 
operable control program, so that no necessary steps are omitted; the external stimulus comprises 
a timing device, and the display comprises a display option for programming the plant to perform 
an action at a time which is input through the input device as a relative position on the display 
device, the relative position including a means for displaying an absolute time entry and means 
for displaying a relative time entry, the display also comprising a display option means for 
performing an action at a time; the control comprises means for presenting the user, on the 
display device, with a most probable action, which may be selected by the user through 
activation of the input device without entering data into the controller through the input device 
relating to both the action and the event; the display also comprising means for indicating 
completion of entry of a programming step, which means indicates to the user that 
the programming step is not completed if information necessary for execution of the step is not 
available to the controller; and the controller being capable of controlling the display device to 
present information to the user relating to the use of the apparatus if necessary for use of the 
device by the user. 

Another object of the present invention provides a system for presenting a program to a 
viewer, comprising a source of program material; means for determining a viewer preference, the 
viewer preference optionally being context sensitive; means for receiving the program material 
from the source; means for characterizing the program material based on its content; means for 
correlating the characterized content of the program material with the determined viewer 
preference to produce a correlation index; and means for presenting the program material to the 
viewer, if the correlation index indicates a probable high correlation between the characterization 
of the program material and the viewer preference. 
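
The correlation step described above can be illustrated with a short sketch. It assumes, purely for illustration, that both the characterized content and the viewer preference are expressed as dictionaries of weighted feature labels, that the correlation index is their cosine similarity, and that a threshold of 0.7 stands for a "probable high correlation". None of these choices, nor the function names, is taken from the specification.

```python
import math

def correlation_index(content, preference):
    """Cosine similarity between two feature-weight dictionaries (0.0 if either is empty)."""
    keys = set(content) | set(preference)
    dot = sum(content.get(k, 0.0) * preference.get(k, 0.0) for k in keys)
    norm_c = math.sqrt(sum(v * v for v in content.values()))
    norm_p = math.sqrt(sum(v * v for v in preference.values()))
    if norm_c == 0.0 or norm_p == 0.0:
        return 0.0
    return dot / (norm_c * norm_p)

def should_present(content, preference, threshold=0.7):
    """Present the program only when the index meets the (assumed) threshold."""
    return correlation_index(content, preference) >= threshold

# Hypothetical viewer preference and two characterized programs.
viewer = {"sports": 1.0, "news": 0.5}
match_broadcast = {"sports": 0.9, "news": 0.4}
period_drama = {"drama": 1.0}
```

Here `should_present(match_broadcast, viewer)` is true because the two profiles weight the same features in nearly the same proportion, while `should_present(period_drama, viewer)` is false because the profiles share no features.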

Another object of the present invention is to provide a system for presenting a program to 
a viewer, comprising a source of program material; means for determining a viewer preference; 
means for receiving the program material from the source; means for storing the program 
material; means for preprocessing the program material to produce a reduced data flow 
information signal retaining information relating to a character of the program material and 
eliminating data not necessary to characterize the program material; means for characterizing the 
information signal based on its content; means for correlating the characterized content of the 
information signal with the determined viewer preference to produce a correlation index; and 
means for presenting the stored program material to the viewer, if the correlation index indicates 
a probable high correlation between the characterization of the information signal and the viewer 
preference. The system may also include a means for storing the information signal, wherein the 
characterizing means characterizes the stored information signal, and also a memory for storing 
the program material while the characterizing means produces characterized content and the 
correlating means produces the correlation index. 

Still another object of the present invention is to provide a system, wherein the program 
material is encrypted, further comprising means for decrypting the program material to produce a 
decryption event; and means for charging an account of the viewer based on the occurrence of a 
decryption event. Thus, a decryption processor and an accounting database are provided for 
these purposes. 

Another object of the present invention is to allow the means for characterizing the 
program material to operate without causing a decryption event. Thus, the data stream may 
include characterization data specifically suitable for processing by a characterizing system, or 
the decryption processor may be provided with multiple levels of functionality, or both. Further, 
the system may comprise a memory for storing the program material while the characterizing 
means produces characterized content and the correlating means produces the correlation index. 
The characterizing means may also characterize the program material stored in memory, and the 
program material stored in memory may be compressed. 

Another object of the present invention is to provide a controller for controlling a plant, 
having a sensor for sensing an external event and producing a sensor signal, an actuator, 
responsive to an actuator signal, for influencing the external event, and a control means for 
receiving the sensor signal and producing an actuator signal, comprising means for inputting a 
program; means for storing the program; means for characterizing the sensor signal to produce a 
characterized signal; and means for comparing the characterized signal with a pattern stored in a 
memory to produce a comparison index, wherein the actuator signal is produced on the basis of 
the comparison index and the program, wherein the characterization comprises an Affine 
transformation of the sensor signal. The characterization may comprise one or more 
transformations selected from the group consisting of an Affine transformation, a Fourier 
transformation, a Gabor transformation, and a wavelet transformation. 
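
As one hedged illustration of such a controller, the sketch below characterizes a sampled sensor signal with a discrete Fourier transform, compares the resulting magnitude spectrum with a stored pattern, and derives the actuator signal from the comparison index and a program. The use of cosine similarity as the comparison index, the dictionary layout of the program, and all names are assumptions made for this example.

```python
import cmath
import math

def fourier_characterize(signal):
    """Magnitude spectrum (bins 0..n/2) of a real-valued sensor signal."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))) / n
        for k in range(n // 2 + 1)
    ]

def comparison_index(characterized, pattern):
    """Cosine similarity of two magnitude spectra, between 0 and 1."""
    dot = sum(a * b for a, b in zip(characterized, pattern))
    norm = (math.sqrt(sum(a * a for a in characterized))
            * math.sqrt(sum(b * b for b in pattern)))
    return dot / norm if norm else 0.0

def actuator_signal(sensor_signal, pattern, program):
    """Choose the actuator output from the comparison index and the stored program.

    `program` is a hypothetical dict with keys 'threshold', 'on_match', 'otherwise'.
    """
    index = comparison_index(fourier_characterize(sensor_signal), pattern)
    return program["on_match"] if index >= program["threshold"] else program["otherwise"]
```

Because only spectral magnitudes are compared, this particular characterization ignores the phase of the sensor signal, so a time-shifted copy of a periodic pattern still matches.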

It is another object of the present invention to provide a method for automatically 
recognizing digital image data consisting of image information, the method comprising the steps 
performed by a data processor of storing a plurality of templates; storing the image data in the 
data processor; generating a plurality of addressable domains from the stored image data, each of 
the domains representing a portion of the image information; creating, from the stored image 
data, a plurality of addressable mapped ranges corresponding to different subsets of the stored 
image data, the creating step including the substep of (a) executing, for each of the mapped 
ranges, a corresponding procedure upon the one of the subsets of the stored image data which 
corresponds to the mapped ranges; (b) assigning identifiers to corresponding ones of the mapped 
ranges, each of the identifiers specifying for the corresponding mapped range a procedure and an 
address of the corresponding subset of the stored image data; (c) optionally subjecting a domain 
to a transform selected from the group consisting of a predetermined rotation, an inversion, a 
predetermined scaling, and a predetermined preprocessing in the time, frequency, and/or wavelet 
domain; (d) selecting, for each of the domains or transformed domains, the one of the mapped 
ranges which most closely corresponds according to predetermined criteria; (e) representing the 
image information as a set of the identifiers of the selected mapped ranges; and (f) selecting, 
from the stored templates, a template which most closely corresponds to the set of identifiers 
representing the image information. The step of selecting the mapped ranges may also include 
the substep of selecting, for each domain, a most closely corresponding one of the mapped 
ranges. 

It is another object of the present invention to provide a method wherein the step of 
selecting the most closely corresponding one of the mapped ranges includes the step of selecting, 
for each domain, the mapped range which is the most similar, by a method selected from one or 
more of the group consisting of selecting minimum Hausdorff distance from the domain, 
selecting the highest cross-correlation with the domain, selecting the highest fuzzy correlation 
with the domain and selecting the minimum mean square error with the domain. 
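
Two of the similarity criteria listed above can be sketched as follows. The example assumes that each domain and mapped range has been flattened to a list of pixel values of equal length, that mapped ranges are held in a dictionary keyed by their identifiers, and that "cross-correlation" means the zero-mean normalized form. These are illustrative assumptions, and the function names are hypothetical.

```python
import math

def mean_square_error(domain, mapped_range):
    """Mean of the squared pixel differences between a domain and a mapped range."""
    return sum((d - m) ** 2 for d, m in zip(domain, mapped_range)) / len(domain)

def cross_correlation(domain, mapped_range):
    """Zero-mean normalized cross-correlation in [-1, 1]; 0.0 for a flat block."""
    n = len(domain)
    mean_d = sum(domain) / n
    mean_m = sum(mapped_range) / n
    num = sum((d - mean_d) * (m - mean_m) for d, m in zip(domain, mapped_range))
    den = math.sqrt(sum((d - mean_d) ** 2 for d in domain)
                    * sum((m - mean_m) ** 2 for m in mapped_range))
    return num / den if den else 0.0

def select_by_mse(domain, ranges):
    """Identifier of the mapped range with the minimum mean square error."""
    return min(ranges, key=lambda ident: mean_square_error(domain, ranges[ident]))

def select_by_correlation(domain, ranges):
    """Identifier of the mapped range with the highest cross-correlation."""
    return max(ranges, key=lambda ident: cross_correlation(domain, ranges[ident]))
```

The two criteria need not agree: mean square error is sensitive to brightness offsets, whereas the normalized cross-correlation compares only the shape of the intensity variation.
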

Another object of the present invention provides a method wherein the step of selecting 
the most closely corresponding one of the mapped ranges includes the step of selecting, for each 
domain, the mapped range with the minimum modified Hausdorff distance calculated as 
D[db,mrb] + D[1 - db,1 - mrb], where D is a distance calculated between a pair of sets of data 
each representative of an image, db is a domain, mrb is a mapped range, 1 - db is the inverse of a 
domain, and 1 - mrb is an inverse of a mapped range. 

Another object of the present invention provides a method wherein the digital image data 
consists of a plurality of pixels each having one of a plurality of associated color map values, 
further comprising the steps of optionally transforming the color map values of the pixels of each 
domain by a function including at least one scaling function for each axis of the color map, each 
of which may be the same or different, and selected to maximize the correspondence between the 
domains and ranges to which they are to be matched; selecting, for each of the domains, the one 
of the mapped ranges having color map pixel values which most closely correspond to the color 
map pixel values of the domain according to predetermined criteria, wherein the step of 
representing the image color map information includes the substep of representing the image 
color map information as a set of values each including an identifier of the selected mapped 
range and the scaling functions; and selecting a most closely corresponding stored template, 
based on the identifier of the color map mapped range, the scaling functions and the set of 
identifiers representing the image information. The first criteria may comprise minimizing the 
Hausdorff distance between each domain and the selected range. 

Another object of the present invention is to provide a method further comprising the 
steps of storing delayed image data, which represents an image of a moving object differing in 
time from the image data in the data processor; generating a plurality of addressable further 
domains from the stored delayed image data, each of the further domains representing a portion 
of the delayed image information, and corresponding to a domain; creating, from the stored 
delayed image data, a plurality of addressable mapped ranges corresponding to different subsets 
of the stored delayed image data; matching the further domain and the domain by subjecting a 
further domain to one or both of a corresponding transform selected from the group consisting of 
a null transform, a rotation, an inversion, a scaling, a translation and a frequency domain 
preprocessing, which corresponds to a transform applied to a corresponding domain, and a 
noncorresponding transform selected from the group consisting of a rotation, an inversion, a 
scaling, a translation and a frequency domain preprocessing, which does not correspond to a 
transform applied to a corresponding domain; computing a motion vector between one of the 
domain and the further domain, or the set of identifiers representing the image information and 
the set of identifiers representing the delayed image information, and storing the motion vector; 
compensating the further domain with the motion vector and computing a difference between the 
compensated further domain and the domain; selecting, for each of the delayed domains, the one 
of the mapped ranges which most closely corresponds according to predetermined criteria; 
representing the difference between the compensated further domain and the domain as a set of 
difference identifiers of a set of selected mapping ranges and an associated motion vector and 
representing the further domain as a set of identifiers of the selected mapping ranges; 
determining a complexity of the difference based on a density of representation; and when the 
difference has a complexity below a predetermined threshold, selecting, from the stored 
templates, a template which most closely corresponds to the set of identifiers of the image data 
and the set of identifiers of the delayed image data. 

Another object of the present invention provides an apparatus for automatically 
recognizing digital image data consisting of image information, comprising means for storing 
template data; means for storing the image data; means for generating a plurality of addressable 
domains from the stored image data, each of the domains representing a different portion of the 
image information; means for creating, from the stored image data, a plurality of addressable 
mapped ranges corresponding to different subsets of the stored image data, the creating means 
including means for executing, for each of the mapped ranges, a procedure upon the one of the 
subsets of the stored image data which corresponds to the mapped range; means for assigning 
identifiers to corresponding ones of the mapped ranges, each of the identifiers specifying for the 
corresponding mapped range an address of the corresponding subset of stored image data; means 
for selecting, for each of the domains, the one of the mapped ranges which most closely 
corresponds according to predetermined criteria; means for representing the image information as 
a set of the identifiers of the selected mapped ranges; and means for selecting, from the stored 
templates, a template which most closely corresponds to the set of identifiers representing the 
image information. 

It is also an object of the present invention to provide a method and system for processing 
broadcast material having a first portion and a second portion, wherein the first portion 
comprises a content segment and the second portion comprises a commercial segment, in order 
to allow alteration in the presentation of commercial segments, based on the recipient, 
commercial sponsor, and content provider, while providing means for accounting for the entire 
broadcast. 

Another object of an embodiment of the present invention provides an apparatus 
comprising a user interface, receiving a control input and a user attribute from the user; a 
memory system, storing the control input and user attribute; an input for receiving content data; 
means for storing data describing elements of the content data; means for presenting information 
to the user relating to the content data, the information being for assisting the user in defining a 
control input, the information being based on the stored user attribute and the data describing 
elements of the content data; and means for processing elements of the content data in 
dependence on the control input, having an output. The apparatus according to this embodiment 
may be further defined as a terminal used by users of a television program delivery system for 
suggesting programs to users, wherein the user interface comprises means for gathering the user 
specific data to be used in selecting programs; the memory system comprises means, connected 
to the gathering means, for storing the user specific data; the input for receiving data describing 
elements of the content data comprises means for receiving the program control information 
containing the program description data; and the processing means comprises program selection 
means, operably connected to the storing means and the receiving means, for selecting one or 
more programs using a user's programming preferences and the program control information. In 
this case, the program selection means may comprise a processor, wherein the user programming 
preferences are generated from the user specific data; and means, operably connected to the 
program selection means, for suggesting the selected programs to the user. The apparatus 
processing means may selectively record the content data based on the output of the processing 
means. Further, the presenting means presents information to the user in a menu format. The 
presenting means may comprise means for matching the user attribute to content data. 

The data describing elements of an associated data stream may, for example, comprise a 
program guide generated remotely from the apparatus and transmitted in electronically accessible 
form, data defined by a human input, and/or data defined by an automated analysis of the content 
data. 

According to another embodiment, the present invention comprises a method, comprising 
the steps of receiving data describing a user attribute; receiving a content data stream, and 
extracting from the content data stream information describing a plurality of program options; 
processing the data describing a user attribute and the information describing a plurality of 
program options to determine a likely user preference; and selectively processing a program option 
based on the likely user preference. The method may be embodied in a terminal for a television 
program delivery system for suggesting programs to users for display on a television using 
program control information and user specific data. In that case, the step of receiving data 
describing a user attribute may comprise gathering user specific data to be used in selecting 
programs, and storing the gathered user specific data; the step of receiving a content data stream 
may comprise receiving both programs and program control information for selecting programs 
as the information describing a plurality of program options; the selectively processing step may 
comprise selecting one or more programs using a user's programming preferences and the 
received program control information, wherein the user programming preferences are generated 
from the user specific data; and the method may further include the step of presenting the program 
or information describing a program option for the selected programs to the user. 

The user attribute may comprise a semantic description of a preference, or some other 
type of description, for example a personal profile, a mood, a genre, an image representing or 
relating to a scene, a demographic profile, a past history of use by the user, a preference against 
certain types of media, or the like. In the case of a semantic preference, the data processing step 
may comprise determining a semantic relationship of the user preference to the information 
describing a plurality of program options. The program options may, for example, be transmitted 
as an electronic program guide, the information being in-band with the content (being transmitted 
on the same channel), on a separate channel or otherwise out of band, through a separate 
communications network, e.g., the Internet, dial-up network, or other streaming or packet based 
communications system, or by physical transfer of a computer-readable storage medium, such as 
a CD-ROM or floppy disk. The electronic program guide may include not only semantic or 
human-readable information, but also other types of metadata relating to or describing the 
program content. 

In a further embodiment of the present invention, it is an object to provide a device for 
identifying a program in response to user preference data and program control information 
concerning available programs, comprising means for gathering the user preference data; means, 
connected to the gathering means, for storing the gathered user preference data; means for 
accessing the program control information; and means, connected to the storing means and 
accessing means, for identifying one or more programs based on a correspondence between a 
user's programming preferences and the program control information. For example, the 
identifying means identifies a plurality of programs, a sequence of identifications transmitted to 
the user being based on a degree of correspondence between a user's programming preferences 
and the respective program control information of the identified program. The device may 
selectively record or display the program, or identify the program for the user, who may then 
define the appropriate action by the device. Therefore, a user may, instead of defining "like" 
preferences, define "dislike" preferences, which are then used to avoid or filter certain 
content. Thus, this feature may be used for censoring or parental screening, or merely to avoid 
unwanted content. Thus, the device comprises a user interface adapted to allow interaction 
between the user and the device for response to one or more of the identified programs. 
Preferably, the means for gathering the user specific data comprises means for 
monitoring a response of the user to identified programs. 
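
By way of illustration only, the ordering of identified programs by degree of correspondence, together with the filtering effect of "dislike" preferences, may be sketched as follows. The function name `rank_programs`, the use of sets of descriptive terms as a stand-in for the program control information, and the term-overlap score are assumptions made for this sketch and are not taken from the embodiments described.

```python
def rank_programs(programs, likes, dislikes):
    """Order programs by degree of correspondence with the user's preferences.

    programs: dict mapping title -> set of descriptive terms (a hypothetical
    stand-in for the program control information).
    likes / dislikes: sets of terms the user prefers or wishes to avoid.
    """
    ranked = []
    for title, terms in programs.items():
        if terms & dislikes:
            continue  # a "dislike" match filters the program out entirely
        score = len(terms & likes)  # crude degree of correspondence
        if score > 0:
            ranked.append((score, title))
    # Highest correspondence first; ties are broken alphabetically.
    ranked.sort(key=lambda item: (-item[0], item[1]))
    return [title for _, title in ranked]


guide = {
    "Nature Hour": {"documentary", "wildlife"},
    "Late Movie": {"movie", "violence"},
    "Ocean Life": {"documentary", "wildlife", "ocean"},
    "Quiz Night": {"game show"},
}
suggestions = rank_programs(
    guide,
    likes={"documentary", "wildlife", "ocean"},
    dislikes={"violence"},
)
```

In this sketch a single disliked term excludes a program regardless of how many liked terms it matches, which corresponds to the censoring or parental screening use; a softer penalty could be substituted where mere avoidance is intended.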

It is a further object of the invention to provide a device which serves as a set top terminal 
used by users of a television program delivery system for suggesting programs to users using 
program control information containing scheduled program description data, wherein the means 
for gathering the user preference data comprises means for gathering program watched data; the 
means, connected to the gathering means, for storing the gathered user preference data 
comprises means, connected to the gathering means, for storing the program watched data; the 
means for accessing the program control information comprises means for receiving the 
program control information comprising the scheduled program description data; and the means, 
connected to the storing means and accessing means, for identifying one or more programs based 
on a correspondence between a user's programming preferences and the program control 
information, being for selecting at least one program for suggestion to the viewer, comprises: 
means for transforming the program watched data into preferred program indicators, wherein a 
program indicator comprises a program category with each program category having a weighted 
value; means for comparing the preferred program indicators with the scheduled program 
description data, wherein each scheduled program is assigned a weighted value based on at least 
one associated program category; means for prioritizing the scheduled programs from highest 
weighted value programs to lowest weighted value programs; means for indicating one or more 
programs meeting a predetermined weight threshold, wherein all other programs are excluded 
from program suggestion; and means, operably connected to the program selection means, for 
displaying for suggestion the selected programs to the user. 
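
The transforming, comparing, prioritizing and thresholding means recited above may be sketched, purely as an illustration, as follows. The names `preferred_indicators` and `suggest`, the use of viewing share as the category weight, and the choice of the highest matching category weight as the program weight are assumptions of this sketch; the text above requires only that each category carry a weighted value and that each scheduled program be weighted from at least one associated category.

```python
from collections import Counter


def preferred_indicators(watched):
    """Transform program watched data into weighted category indicators.

    watched: list of category names, one per program viewed.
    Returns a dict of category -> weight (share of all viewing).
    """
    counts = Counter(watched)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def suggest(schedule, indicators, threshold):
    """Weight each scheduled program, prioritize, and apply the threshold.

    schedule: dict of title -> list of categories for that program.
    """
    weighted = []
    for title, categories in schedule.items():
        # Assumed rule: a program takes the weight of its best category.
        weight = max((indicators.get(c, 0.0) for c in categories), default=0.0)
        weighted.append((weight, title))
    # Highest weighted value first; ties are broken alphabetically.
    weighted.sort(key=lambda item: (-item[0], item[1]))
    # Programs below the threshold are excluded from suggestion.
    return [title for weight, title in weighted if weight >= threshold]


indicators = preferred_indicators(["sports", "sports", "news", "drama"])
schedule = {
    "Evening News": ["news"],
    "Cup Final": ["sports"],
    "Cooking Show": ["cooking"],
    "Courtroom": ["drama", "news"],
}
picks = suggest(schedule, indicators, threshold=0.25)
```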

It is a further aspect of the invention to provide a device comprising: a data 
selector, for selecting a program from a data stream; an encoder, for encoding programs in a 
digitally compressed format; a mass storage system, for storing and retrieving encoded programs; 
a decoder, for decompressing the retrieved encoded programs; and an output, for outputting the 
decompressed programs. 
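
The cooperation of the data selector, encoder, mass storage system, decoder and output may be sketched as follows. This is a minimal sketch under stated assumptions: the class name `ProgramRecorder` is hypothetical, an in-memory dictionary stands in for the mass storage system, and the lossless `zlib` compressor from the Python standard library stands in for whatever digital compression format an actual embodiment would employ.

```python
import zlib


class ProgramRecorder:
    """Select, encode, store, retrieve, decode and output programs."""

    def __init__(self):
        self._storage = {}  # stands in for the mass storage system

    def record(self, data_stream, wanted_title):
        """Data selector and encoder: store one program in compressed form."""
        for title, payload in data_stream:
            if title == wanted_title:
                self._storage[title] = zlib.compress(payload)
                return True
        return False  # the requested program was not in the stream

    def play(self, title):
        """Decoder and output: retrieve and decompress a stored program."""
        return zlib.decompress(self._storage[title])


stream = [("News", b"headline " * 50), ("Movie", b"frame " * 200)]
recorder = ProgramRecorder()
recorder.record(stream, "Movie")
```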

Therefore, the present invention provides a system and method for making use of the 
available broadcast media forms for improving an efficiency of matching commercial 
information to the desires and interests of a recipient, improving a cost effectiveness for 
advertisers, improving a perceived quality of commercial information received by recipients and 
increasing profits and reducing required information transmittal by publishers and media 
distribution entities. 

This improved advertising efficiency is accomplished by providing a system for collating 
a constant or underlying published content work with a varying, demographically or otherwise 
optimized commercial information content. This commercial information content therefore need 
not be predetermined or even known to the publisher of the underlying works, and in fact may be 
determined on an individual receiver basis. It is also possible to integrate the demographically 
optimized information within the content. For example, overlays in traditional media, and 
electronic substitutions or edits in new media, may allow seamless integration. The content 
alteration need not be only based on commercial information, and therefore the content may vary 
based on the user or recipient. 

The technologies emphasize adaptive pattern recognition of both the user input and data, 
with possible use of advanced signal processing and neural networks. These systems may be 
shared between the interface and operational systems, and therefore a controller for a complex 
system may make use of the intrinsic processing power available, rather than requiring additional 
computing resources, although this unification is not required. In fact, while hardware efficiency 
dictates that near term commercial embodiments employ common hardware for the interface 
system and the operational system, future designs may successfully separate the interface system 
from the operational system, allowing portability and efficient application of a single interface 
system for a number of operational systems. 

The adaptive nature of the technologies derives from an understanding that people learn 
most efficiently through the interactive experiences of doing, thinking, and knowing. Users 
change in both efficiency and strategy over time. To promote ease-of-use, efficiency, and lack of 
frustration of the user, the interface of the device is intuitive and self explanatory, providing 
perceptual feedback to assist the operator in communicating with the interface, which in turn 
allows the operational system to identify a desired operation. Another important aspect of 
man-machine interaction is that there is a learning curve, which dictates that devices which are 
especially easy to master become frustratingly elemental after continued use, while devices 
which have complex functionality with many options are difficult to master and may be initially 
rejected, or used only at the simplest levels. The present technologies address these issues by 
determining the most likely instructions of the operator, and presenting these as easily available 
choices, by analyzing the past history data and by detecting the "sophistication" of the user in 
performing a function, based on all information available to it. The context of use is also a factor 
in many systems. The system seeks to optimize the interface adaptively and immediately in 
order to balance and optimize both quantitative and qualitative factors. This functionality may 
greatly enhance the quality of interaction between man and machine, allowing a higher degree of 
overall system sophistication to be tolerated. 

The interface system analyzes data from the user, which may include both the selections made 
by the user in context and the efficiency with which the user achieves the selection. Thus, 
information concerning both the endpoints and the path is considered and analyzed by the human 
user interface system. 

The interface may be advantageously applied to an operational system that has a plurality 
of functions, certain of which are unnecessary or are rarely used in various contexts, while others 
are used with greater frequency. In such systems, the application of functionality may be 
predictable. Therefore, the present technologies provide an optimized interface system that, upon 
recognizing a context, dynamically reconfigures the availability or ease of availability of 
functions and allows various functional subsets to be used through "shortcuts". The interface 
presentation will therefore vary over time, use and the particular user. 
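
One possible way of reconfiguring the ease of availability of functions upon recognizing a context is sketched below. The class name `AdaptiveMenu`, the representation of a context as a simple label, and the use of raw usage counts to choose "shortcuts" are all assumptions of this sketch rather than features recited above.

```python
from collections import Counter, defaultdict


class AdaptiveMenu:
    """Track which functions are used in each context and surface shortcuts."""

    def __init__(self, functions):
        self.functions = list(functions)
        self.usage = defaultdict(Counter)  # context -> function -> use count

    def record_use(self, context, function):
        self.usage[context][function] += 1

    def shortcuts(self, context, limit=2):
        """Return the functions most used in this context, most frequent first."""
        counts = self.usage[context]
        used = [f for f in self.functions if counts[f] > 0]
        # The sort is stable, so equally used functions keep their menu order.
        used.sort(key=lambda f: -counts[f])
        return used[:limit]


menu = AdaptiveMenu(["play", "record", "set clock", "eject"])
for function in ["record", "record", "eject", "record", "play", "eject"]:
    menu.record_use("evening", function)
menu.record_use("morning", "play")
```

Because the counts are kept per context, the same device presents different shortcuts in the "evening" and "morning" contexts, so that the presentation varies with time, use and the particular user.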

The advantages to be gained by using an intelligent data analysis interface for facilitating 
user control and operation of the system are more than merely reducing the average number of 
selections or time to access a given function. Rather, advantages also accrue from providing a 
means for access and availability of functions not necessarily previously existing or known to the 
user, improving the capabilities and perceived quality of the product. 

Further improvements over prior interfaces are also possible due to the availability of 
pattern recognition functionality as a part of the interface system. In those cases where the 
pattern recognition functions are applied to large amounts of data or complex data sets, in order 
to provide a sufficient advantage and acceptable response time, powerful computational 
resources, such as powerful RISC processors, advanced DSPs or neural network processors are 
made available to the interface system. On the other hand, where the data is simple or of limited 
scope, aspects of the technology may be easily implemented as added software-based 
functionality in existing products having limited computational resources. 

The application of these technologies to multimedia data processing systems provides a 
new model for performing image pattern recognition and for the programming of applications 
including such data. The ability of the interface to perform abstractions and make decisions 
regarding a closeness of presented data to selection criteria makes the interface suitable for use in 
a programmable control, i.e., determining the existence of certain conditions and taking certain 
actions on the occurrence of detected events. Such advanced technologies might be especially 
valuable for disabled users. 

In a multimedia environment, it may be desirable for a user to perform an operation on a 
multimedia data event. Past systems have required explicit indexing or identification of images 
and events. The present technologies, however, allow an image, diagrammatic, abstract or 
linguistic description of the desired event to be acquired by the interface system from the user 
and applied to identify or predict the multimedia event(s) desired, without requiring a separate 
manual indexing or classification effort. These technologies may also be applied to single media 
data. 

The interface system analyzes data from many different sources for its operation. Data 
may be stored or present in a dynamic data stream. Thus, in a multimedia system, there may be a 
real-time video feed, a stored event database, as well as an exemplar or model database. Further, 
since the device is adaptive, information relating to past experience of the interface, both with 
respect to exposure to data streams and user interaction, is also stored. 

This data analysis aspect of the interface system may be substantially processor intensive, 
especially where the data includes abstract or linguistic concepts or images to be analyzed. 
Interfaces that do not relate to the processing of such data may be implemented with simpler 
hardware. On the other hand, systems that handle complex data types may necessarily include 
sophisticated processors, adaptable for use by the interface system. A portion of the data 
analysis may also overlap the functional analysis of the data for the operational system. 

Other objects and features of the present invention will become apparent from the 
following detailed description considered in conjunction with the accompanying drawings. It is 
to be understood, however, that the drawings are designed solely for the purposes of illustration 
and not as a definition of the limits of the invention, for which reference should be made to the 
appended claims. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Embodiments of the present invention are shown in the figures of the drawings, in which: 

Fig. 1 is a flow chart of the steps required to set a VCR; 

Fig. 2 shows a graphical comparison of required and extra keypresses for the prior art and 
the interface of the present invention; 

Fig. 3 graphically shows the differences in seconds between total time for the prior art for 
each user; 

Fig. 4 graphically shows the differences in seconds between total time for the interface of 
the present invention for each user; 

Fig. 5 graphically shows the programming steps for the comparison of the prior art and 
the interface of the present invention; 

Fig. 6 graphically shows comparative statistics by user comparing the prior art and the 
interface of the present invention; 

Figs. 7 and 8 graphically show the critical steps in programming the prior art and the 
interface of the present invention; 

Fig. 9 graphically shows the number of keypresses made by test participants comparing 
the prior art and the interface of the present invention; 

Fig. 10 graphically shows the comparison of the actual and theoretical number of 
keypresses necessary for programming the prior art and the interface of the present invention; 

Fig. 11 graphically compares the actual and theoretical time necessary for programming 
the prior art and the interface of the present invention; 

Figs. 12a and 12b graphically compare the actual and theoretical time necessary for 
setting the programs in the prior art and the interface of the present invention; 

Figs. 13 and 14 graphically show the percentage time for the critical steps in 
programming the prior art and the interface of the present invention; 

Fig. 15 is a flow diagram of a predictive user interface of the present invention; 

Fig. 16 is a flow diagram of the program input verification system of the present 
invention; 

Fig. 17 is a flow diagram of a predictive user preference aware interface of the present 
invention; 

Fig. 18 is a block diagram of a non-program information feature extraction circuit of the 
present invention; 

Fig. 19 is a diagram of a block of information for a catalog entry of the present invention; 

Fig. 20 is a block diagram of a digital information and analog signal reading/recording 
apparatus; 

Fig. 21 is a block diagram of a user level determining system of the present invention; 

Fig. 22 is a block diagram of a template-based pattern recognition system of the present 
invention; 

Fig. 23 is a block diagram of a control system of the present invention incorporating a 
pattern recognition element and an interface; 

Fig. 24 is a block diagram of a control system for characterizing and correlating a signal 
pattern with a stored user preference of the present invention; 

Fig. 25 is a block diagram of a multiple video signal input apparatus, with pattern 
recognition, data compression, data encryption, and a user interface of the present invention; 

Fig. 26 is a block diagram of a control system for matching a template with a sensor 
input, of the present invention; 

Figs. 27, 28 and 29 are flow diagrams of an iterated function system method for 
recognizing a pattern according to the present invention; 

Fig. 30 is a semi-cartoon flow diagram of the object decomposition and recognition 
method of the present invention; and 

Fig. 31 is a block diagram of an adaptive interface system according to the present 
invention. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

The preferred embodiments of the present invention will now be described with reference 
to the Figures. Identical elements in the various figures are designated with the same reference 
numerals. 

EXAMPLE 1 
VCR INTERFACE 

A preferred embodiment of the interface of the present invention, described in the present 
example, provides automatic sequencing of steps, leading the user through the correct sequence 
of actions to set a program on the screen, so that no necessary steps are omitted, and no optional 
steps are accidentally or unintentionally omitted. These steps are shown diagrammatically in 
Fig. 15 of the present invention. In addition, such a system does not burden the user with the 
necessity of inputting superfluous information, nor overwhelm the user with the display of 
unnecessary data. See, Hoffberg, Linda I., "AN IMPROVED HUMAN FACTORED 
INTERFACE FOR PROGRAMMABLE DEVICES: A CASE STUDY OF THE VCR", 
Master's Thesis, Tufts University; Hoffberg, Linda I., "Designing User Interface Guidelines For 
Time-Shift Programming of a Video Cassette Recorder (VCR)", Proc. of the Human Factors Soc. 
35th Ann. Mtg. pp. 501-504 (1991); and Hoffberg, Linda I., "Designing a Programmable 
Interface for a Video Cassette Recorder (VCR) to Meet a User's Needs", Interface 91 pp. 346-351 
(1991). See also, U.S. Patent Application No. 07/812,805, incorporated herein by reference in its 
entirety, including appendices and incorporated references. 

Many design considerations were found to be important in the improved interface of the 
present invention: 

The interface should preferably employ only minimal amounts of abbreviations and the 
use of complete words is especially preferred, except where a standard abbreviation is available 
or where an "iconic" or symbolic figure or textual cue is appropriate. Thus, standard 
abbreviations and symbols are acceptable, and displayed character strings may be shortened or 
truncated in order to reduce the amount of information that is to be displayed, where necessary or 
desirable. An option may be provided to the user to allow full words, which may decrease the 
information which may be conveyed on each screen and increase the number of screens that must 
be displayed, or abbreviations and symbols, which may minimize the number of displayed 
screens of information, thus allowing the user to make the compromise. This aspect of the 
system may also be linked to the adaptive user level function of the present invention, wherein 
abstract symbols and abbreviations are presented to advanced users, while novices are presented 
with full words, based on an implicit indication of user level. These abstract symbols and 
abbreviations may be standard elements of the system, or user designated icons. Of course, the 
user could explicitly indicate his preference for the display type, thus deactivating the automatic 
adaptive user level function. 
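
The linkage between the displayed label style and the adaptive user level function may be sketched as follows. The label table, the function names, and in particular the rule of inferring the user level from the ratio of required to actual keypresses with a cut-off of 0.8 are assumptions introduced for this sketch; the text above states only that the level is based on an implicit indication and that an explicit preference deactivates the automatic function.

```python
# Each entry gives (full word, abbreviation); the entries are illustrative.
LABELS = {
    "record": ("Record", "REC"),
    "channel": ("Channel", "CH"),
}


def estimate_level(required_keypresses, actual_keypresses):
    """Implicitly classify the user from how efficiently a task was completed.

    actual_keypresses must be greater than zero.
    """
    efficiency = required_keypresses / actual_keypresses
    return "advanced" if efficiency >= 0.8 else "novice"  # assumed cut-off


def label_for(key, level, explicit_style=None):
    """Choose a full word or an abbreviation for one screen label.

    explicit_style ("full" or "short"), when given, overrides the inferred
    level, which deactivates the automatic adaptive behavior.
    """
    full, short = LABELS[key]
    style = explicit_style or ("short" if level == "advanced" else "full")
    return short if style == "short" else full


level = estimate_level(required_keypresses=10, actual_keypresses=11)
```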

If multiple users use the device, then the device identifies the relevant users. This may be 
by explicit identification by keyboard, bar code, magnetic code, smart card (which may 
advantageously include a user profile for use with a number of devices), an RF-ID or IR-ID 
transponder, voice recognition, image recognition, or fingerprint identification. It is noted that 
smart cards or other intelligent or data-containing identification systems may be used with 
different types of devices, for example video, audio, home appliances, HVAC and automobile 
systems. 

Where a new user is identified to the system, an initial query may be made to determine 
an optimum initial user level. This allows further identification of the user and preference 
determination to occur more efficiently. 

In applications in which a user must program an event on a certain date, at a certain time, 
a built-in calendar menu screen is preferably employed so that the user cannot set the device with 
a program step that relies on a non-existent date. Technology that will help eliminate the human 
problem of setting the wrong (yet existing) date may also be employed. Such technology might 
include accessing an on-line or other type of database containing media programming 
information, and prompting the user regarding the selected choice. In situations where it is 
applicable, the interface should indicate to the user the number of characters the interface is 
expecting, such as when entering the year. 
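
The rejection of a non-existent date, together with a prompt indicating the expected number of characters for the year, may be sketched as follows. The function name `checked_date` and the wording of the messages are assumptions of this sketch. It relies on the standard `datetime.date` constructor, which raises `ValueError` for a day or month that does not exist on the calendar. The sketch does not address the separate problem of a wrong but existing date.

```python
from datetime import date


def checked_date(year, month, day):
    """Return a calendar date, or raise ValueError with a directive message."""
    if not 1000 <= year <= 9999:
        # Tell the user how many characters the interface expects.
        raise ValueError("Enter the year using 4 digits")
    try:
        return date(year, month, day)
    except ValueError:
        # date() rejects days and months that do not exist on the calendar.
        raise ValueError(
            f"{month}/{day}/{year} does not exist; choose another date"
        ) from None


leap_day = checked_date(1996, 2, 29)
```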

The interface system provides an easily accessible CHANGE, CANCEL or UNDO 
(single or multiple level) feature, which facilitates backtracking or reprogramming the 
immediately previously entered information rather than forcing the user to repeat all or a 
substantial portion of the programming steps. A method of the type described is shown in Fig. 
16 of the present invention. User input is also facilitated by the provision of frequently used 
settings as explicit choices, such as, referring to the VCR example, "Record today," "Record 
tomorrow," "Noon," and "Midnight," so that the user does not have to specify a date in these 
cases. This will eliminate extra keypresses, and reduce the programming time. In addition, this 
could eliminate user errors. Frequently used choices for program selections are also provided to 
the user to reduce the number of programming steps necessary and provide the user with all the 
frequently used selections. The especially preferred choices are "Once On .", "Once a Week on 
.", "Monday - Friday at .", "Everyday at .". These redundant, complex instructions reduce the 
number of keystrokes required for data entry, and reduce the amount of programming time 
required. 
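
A multiple level UNDO feature of the kind referred to above, which reverts only the immediately previously entered information, may be sketched as follows. The class name `ProgramEntry` and the field names are hypothetical; the sketch is not the method of Fig. 16, only one conventional way of obtaining the stated behavior with a history stack.

```python
_UNSET = object()  # marks a field that had no value before it was entered


class ProgramEntry:
    """Holds the fields of one recording program with multi-level undo."""

    def __init__(self):
        self.fields = {}
        self._history = []  # stack of (field, value before the entry)

    def enter(self, field, value):
        self._history.append((field, self.fields.get(field, _UNSET)))
        self.fields[field] = value

    def undo(self):
        """Revert the most recent entry; return False if nothing remains."""
        if not self._history:
            return False
        field, previous = self._history.pop()
        if previous is _UNSET:
            del self.fields[field]
        else:
            self.fields[field] = previous
        return True


entry = ProgramEntry()
entry.enter("channel", 4)
entry.enter("start", "20:00")
entry.enter("channel", 7)
entry.undo()  # only the last channel change is reverted
```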

The presently described interface system also provides, in the event that a color screen is 
available, conservatively used color coding, which allows the user to effectively and quickly 
acknowledge the function of each aspect of the screen. When programming, the preferred colors 
are royal blue for "help," red for mistakes, light blue for information previously entered, and 
yellow for current information being entered. Of course, other colors could be used, according to 
the user's or designer's preference, cultural differences, and display parameters. 

When viewing, it is preferable that screen colors change to indicate status changes, such 
as viewed/unviewed, or to categorize the shows. 

The interface includes a confirmation screen which displays to the user all of the 
categories and selections previously explicitly entered or otherwise inferred, and should be easily 
understandable. This is shown in Fig. 15 of the present invention. All of the necessary 
information is displayed on this screen, in addition to the change and cancel options, if possible. 

The entering of information on each screen is preferably consistent throughout the 
various interface options and levels. All of the screens preferably have similar layouts. 
"Buttons" or screen locations which are keyed to a particular function, which appear on multiple 
screens, should appear in approximately the same location on all screens. However, in certain 
cases, relatively more important information on a given screen may be displayed more 
prominently, and possibly in a different screen location, in order to reduce the search time. 
Further, when other factors dictate, each screen may be independently optimized for the 
prescribed function. For example, a representation of an analog clock dial may be used to set 
time information. However, even if the format does change, a standard scheme should be 
maintained, such as the use of a particular color to indicate that a particular program aspect has 
been changed. 

The interface should display data consistent with standards and conventions familiar to 
users. For example, when entering dates, users are most familiar with calendars. However, this type 
of presentation of choices does not eliminate the human problem of entering incorrect 
information, e.g., setting a wrong, but existing, date. The problem of ensuring the accuracy of 
user input may be addressed by an intelligent interface which stores data concerning 
programming and user preferences, and which uses some logical method, such as Boolean logic, 
fuzzy logic, neural network theory, or any other system which may be used to generate a 
prediction, to determine whether an entry is likely in error by comparing the prediction with the 
entry. Of course, these predictive systems would also provide an initial default entry, so that an 
a priori most probable action or actions may be initially presented to the user. 
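
A very simple predictive system of this kind, which both supplies an initial default and flags an entry that disagrees with the stored history, may be sketched as follows. The class name `EntryPredictor`, the frequency-based prediction, and the 10% cut-off below which an entry is queried are assumptions of this sketch; any of the logical methods named above could be substituted for the frequency count.

```python
from collections import Counter


class EntryPredictor:
    """Predict a field value from past entries and flag unlikely new entries."""

    def __init__(self, min_share=0.1):
        self.history = Counter()
        self.min_share = min_share  # assumed cut-off for querying an entry

    def learn(self, value):
        self.history[value] += 1

    def default(self):
        """A priori most probable value, offered as the initial default."""
        return self.history.most_common(1)[0][0] if self.history else None

    def is_suspect(self, value):
        """True when the entry is rare enough to warrant a confirmation prompt."""
        total = sum(self.history.values())
        if total == 0:
            return False  # nothing learned yet, so nothing to compare against
        return self.history[value] / total < self.min_share


channels = EntryPredictor()
for channel in [4, 4, 7, 4, 7, 4, 4, 4, 7, 4]:
    channels.learn(channel)
```

A suspect entry is not rejected in this sketch; it would merely prompt the user to confirm, since an unusual entry may still be intended.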

In addition to following conventions of information presentation to the user, the interface 
of the present invention may also provide emulations of other user interfaces with which a 
particular user may be familiar, even if these are not optimized according to the presently 
preferred embodiments of the present invention, or not otherwise well known. These emulations 
need not even be of the same type of device, so that a broad based standard for entry of 
information into programmable controls, regardless of their type, may be implemented. By 
allowing emulation, the interface could provide compatibility with a standard or proprietary 
interface, with enhanced functionality provided by the features of the present interface. 

These enhanced functional intelligent aspects of the controller may be implemented by 
means of software programming of a simple microcomputer, or by use of more specialized 
processors, such as a Fuzzy Set Processor (FSP) or Neural Network Processor to provide real- 
time responsiveness, eliminating delays associated with the implementation of complex 
calculations on general purpose computing devices. 

In the various embodiments according to the present invention, various control strategies 
are employed. Depending on the application, fuzzy set processors (FSP's) may be preferred 
because they have the advantage of being easier to program through the use of presumptions or 
rules for making the fuzzy inferences, which may be derived by trial and error or the knowledge 
of experts, while Neural Networks are less easily explicitly programmed and their network 
weighting values are not easily understood in the abstract, but these systems may be applied to 
learn appropriate responses from test data. Thus, neural networks tend to require extensive 
"training", while Fuzzy Set Processors may be explicitly programmed without the need of 
duplicating or simulating actual operating conditions, but may require "fine tuning". 
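
To illustrate what is meant by explicitly programmed rules for fuzzy inference, a software sketch of two such rules follows. It uses the common convention of taking the minimum for AND and the maximum for OR. The rules themselves, the membership ramps, and the names `ramp` and `confirm_strength` are assumptions invented for this sketch, and the break points are exactly the kind of values that would be subject to "fine tuning".

```python
def ramp(x, low, high):
    """Membership degree rising linearly from 0 at low to 1 at high."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def confirm_strength(hour_offset, channel_rarity):
    """Degree to which the interface should ask the user to confirm an entry.

    hour_offset: hours between the entered start time and the usual one.
    channel_rarity: 0.0 (channel always used) to 1.0 (never used before).
    """
    time_unusual = ramp(hour_offset, 0, 6)
    channel_unusual = channel_rarity  # already a degree between 0 and 1
    channel_very_unusual = ramp(channel_rarity, 0.5, 1.0)
    # Rule 1: IF time is unusual AND channel is unusual THEN confirm.
    rule1 = min(time_unusual, channel_unusual)
    # Rule 2: IF channel is very unusual THEN confirm.
    rule2 = channel_very_unusual
    # The rule outputs are combined with OR.
    return max(rule1, rule2)


strength = confirm_strength(hour_offset=3, channel_rarity=0.4)
```

Each rule can be read and adjusted directly, without training data, which is the property the passage above contrasts with the weighting values of a neural network.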

The most frequently used choices preferably should be displayed as the default setting. 
The screen cursor preferably appears at the "accept" screen button, when the screen is displayed. 
This default can either be set in advance, or acquired by the system. In the case of acquired 
defaults, these may be explicitly set by the user or adaptively acquired by the system through use. 
The interface of the present invention may be taught, in a "teach" mode, the preferences of the 
user, or may also acquire this information by analyzing the actual choices made by the user 
during operation of the interface and associated controller. This type of operation is shown 
schematically in Fig. 15 of the present invention. The options of "Midnight" (12:00 AM) and 
"Noon" (12:00 PM) should preferably be present, as some people often become confused when 
distinguishing between them. Icons, such as those indicative of the "sun" and the "moon", may 
also be used to facilitate data entry for AM and PM. The interface should preferably utilize an 
internal clock and calendar so that the user cannot set the time or program to record on a 
nonexistent date. Such a system could also compensate for daylight-savings time seasonal 
adjustments. 
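
The source of the confusion between "Midnight" and "Noon" is the special treatment of the hour 12 on a 12-hour clock, which the following sketch makes explicit. The function name `to_24_hour` and the `PRESETS` table are assumptions of this sketch; the conversion rule itself is the ordinary one.

```python
def to_24_hour(hour, minute, period):
    """Convert a 12-hour clock entry to (hour, minute) on a 24-hour clock."""
    if not (1 <= hour <= 12 and 0 <= minute <= 59):
        raise ValueError("hour must be 1-12 and minute 0-59")
    period = period.upper()
    if period not in ("AM", "PM"):
        raise ValueError("period must be AM or PM")
    hour24 = hour % 12      # 12 becomes 0 for both AM and PM
    if period == "PM":
        hour24 += 12        # 12 PM -> 12, 1 PM -> 13, and so on
    return hour24, minute


# Explicit choices spare the user from working out the 12 o'clock cases.
PRESETS = {
    "Midnight": to_24_hour(12, 0, "AM"),
    "Noon": to_24_hour(12, 0, "PM"),
}
```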

The cursor is preferably distinctive and readily distinguished from other parts of the 
screen. This may be by color, attribute (e.g., blinking), size, font change of underlying text, or by 
other means. 

The user can preferably exit the programming sequence at any time by selecting a "Main 
Menu" button which may exist on the lower left-hand corner of every screen. The user is 
preferably provided with an adequate amount of feedback, and error messages should be 
directive in nature. Some form of an acknowledgement is preferably displayed after each entry. 
The user should preferably not be able to go to the next programming step until the current step 
has been completed. A message to convey why the user cannot continue should appear when an 
attempt to prematurely continue is recognized. 



The "help" function is available for when the user does not know what to do. The "help" 
screen(s) preferably explains the functions of each of the available buttons or functions, but may 
also be limited to those that are ambiguous. The "help" screen may also be used to indicate a 
current status of the interface and the controller. Further, the "help" function may also provide 
access to various other functions, such as advanced options and configurations, and thus need not 
be limited to merely providing information on the display. The help system may incorporate a 
hypertext-type system, wherein text or information relating to concepts that are conceptually 
linked may be easily accessed by indicating to the interface system that the related information is 
desired. To eliminate the possibility of the user trying to make selections on merely informative 
help screens, the cursor, in these cases, should be locked to a choice which returns the user to 
where they left off in the programming sequence, and this choice should be highlighted. 

The "help" function may also comprise "balloon help" similar to the system adopted by 
Apple Computer, Inc. in Macintosh Operating System, e.g., 7.0, 7.1, 7.5, etc. 

The interface preferably initiates the programming sequence where the user wants to be, 
so that the interface has so-called "smart screens". For example, when a VCR is first powered up 
or after an extended power failure, and the time and date are not stored in the machine, the "set 
date" and "set time" screens should appear. The sequence of screens may also vary depending on 
the system predicted requirements of the user and various aspects of the improved interface of 
the present invention. This is shown schematically in Fig. 17 of the present invention. 
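
As a non-limiting sketch of the "smart screen" behavior described above, the first screen presented may be selected from the stored state of the device. The function name `first_screen` and its two boolean arguments are assumptions made for illustration; the fall-through to the main menu reflects the default described elsewhere herein.

```python
def first_screen(date_is_set, time_is_set):
    """Choose the screen shown on entering the programming sequence.

    Hypothetical helper: the flags stand for whatever the machine
    retains after power-up or an extended power failure.
    """
    if not date_is_set:
        return "set date"
    if not time_is_set:
        return "set time"
    # Nothing is missing, so start at the most frequently used screen.
    return "main menu"
```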

The preferable input device for the interface of the present invention provides as few 
buttons as possible to achieve the required functionality, thus reducing potential user 
intimidation and focusing the user's attention on the interactive display screen, where the available 
choices are minimized to that number necessary to efficiently allow the user to program the 
discrete task presented. Such a minimization of discrete inputs facilitates a voice recognition 
input, which may be used as an alternative to mechanical input devices. The preferred 
embodiment includes a direct-manipulation type interface, in which a physical act of the user 
causes a proportionate change in the associated interface characteristic, such as cursor position. 
A computer mouse, e.g. a two dimensional input device, with 1 to 3 buttons is the preferred input 
device, for use with a general purpose computer as a controller, while a trackball on a remote 
control device is especially preferred for limited purpose controllers because it does not require 
a flat surface for operation. Other stationary or movement sensitive input devices may, of course, 
be used, such as joysticks, gyroscopes, sonic echo-location, magnetic or electrostatic location 
devices, RF phase location devices, Hallpots (joystick-like device with magnets that move with 
respect to Hall effect transducers), etc. The present interface minimizes the number of necessary 
keys present on an input device, while maintaining the functionality of the interface. It is noted 
that a strict minimization without consideration of functionality might lead to inefficiency. For 
example, in a VCR device, if the user wants to record a program which airs Monday through 
Friday, he would have to set five separate programs, rather than one program if a "weeknights" 
choice is made available. 
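
The saving afforded by a "weeknights" choice may be illustrated with a short sketch in which a single program type expands into the days on which recording occurs. The names `recording_days` and `ALL_DAYS`, and the string labels for the program types, are hypothetical and chosen only for this example.

```python
ALL_DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def recording_days(program_type, day=None):
    """Expand one program entry into the days on which it records.

    Hypothetical helper: the program type labels are assumptions.
    """
    if program_type == "weeknights":
        return ALL_DAYS[:5]          # Monday through Friday, one entry
    if program_type == "everyday":
        return list(ALL_DAYS)
    if program_type in ("once", "weekly"):
        if day not in ALL_DAYS:
            raise ValueError("a day of the week is required")
        return [day]
    raise ValueError("unknown program type: %r" % (program_type,))
```

A single "weeknights" entry thus stands in for the five separate programs that would otherwise have to be set.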

The interface preferably should be easy to learn and should not require that a user have 
prior knowledge of the interface in order to use it. An attempt has been made to minimize the 
learning curve, i.e., to minimize the time it takes to learn how to use the device. 

Menu options are preferably displayed in logical order or in their expected frequencies. 
Research has shown that a menu-driven interface is best for applications involving new users and 
does not substantially hinder experienced users. Menu selection is preferably used for tasks 
which involve limited choices. They are most helpful for users with little or no training. Each 
menu should preferably allow only one selection at a time. Most of the information is preferably 
entered using a numeric keypad (entry method), rather than using up and down arrow keys 
(selection method). In addition, no leading zeros are required for entry. If there is more than one 
keystroke required, the user must then select an "OK" button to continue in the programming 
sequence. However, if the selection method is used, all of the choices are displayed on the 
screen at once. The number of steps required to complete the task through a sequence of menus 
should be minimized. The choice of words used to convey information should not be device 
specific, i.e., computer terms, but rather normal, everyday terms which are easy to understand. 
In addition, very few abbreviations should be used. All necessary information which the user 
needs should preferably be displayed at once. A user preferably should not have to rely on his 
memory or his previous experience, in order to find the correct choice, at least at the lower user 
levels. If all selections cannot be displayed at once, a hierarchical sequence is preferably used. 
A main menu should preferably provide a top level to which the user can always return and start 
over. 

Searching and learning times should be kept to a minimum in order to obtain a 
subjectively better interface. The system's logic should reflect the users' expectations, offer 
visual clues and feedback, and stay within human memory limits. For example, the VCR should 
turn on not only with the "Power" button, but also when inserting a tape into the device. In 
addition, the sequence of steps for setting the machine to record, if the user does not indicate 
implicitly or explicitly that he knows how to use the device, should assume that the user is a 
novice, and fully prompt the user for elemental items of information. Nothing should be taken 
for granted. By developing an improved interface, an attempt is made to: reduce the searching 
time; reduce the learning time; simplify the entering of data; and, reduce the intimidation 
experienced by certain persons when using electronic devices. 

Tests by an inventor hereof show that people do not program their VCRs often, and they 
often forget the sequence of steps between recording sessions. Thus, the present invention 
preferably incorporates an adaptive user level interface, wherein a novice user is presented with a 
simpler interface with fewer advanced features initially available, so that there is reduced 
searching for the basic functions. A more advanced user is presented with more advanced 
choices and functions available initially, as compared to a novice user. 

Thus, as shown in Fig. 17, the user identifies himself to the controller in block 1701. The 
controller 1806 of Fig. 18 thereafter uses a stored profile of the identified user in controlling the 
interaction with the user, as shown in block 1702 of Fig. 17, from information stored in the 
database 1807 of Fig. 18 of the present invention. It has been found that, in the case of novice 
users, a greater number of simple instructions may be input more quickly and easily than a 
potentially smaller number of instructions drawn from a larger set of more complex instructions. 
It has further been found that, even if presented with a set of instructions which will allow a 
program to be entered with fewer inputs, a novice user may choose to input the program using 
the simple instructions exclusively, thus employing an increased number of instructions and being 
delayed by an increased search time for those instructions that are used, from the larger set. 
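
The use of a stored profile to control the interaction, as in blocks 1701 and 1702, might be sketched as follows. This is an illustrative assumption rather than the implementation of the controller 1806: the dictionary `PROFILES` merely stands in for the database 1807, and the names `menu_for`, `BASIC_CHOICES` and `ADVANCED_CHOICES`, as well as the advanced items listed, are hypothetical.

```python
# Hypothetical stand-in for the user profiles held in database 1807.
PROFILES = {
    "user_a": {"level": "advanced"},
    "user_b": {"level": "novice"},
}

BASIC_CHOICES = ["Review", "Enter new recording time", "Set time", "Set date"]
ADVANCED_CHOICES = ["Tape speed", "Program Monday-Friday"]  # illustrative only


def menu_for(user_id, profiles=PROFILES):
    """Return the choices initially offered to the identified user."""
    # An unidentified user is treated as a novice, so nothing is assumed.
    level = profiles.get(user_id, {}).get("level", "novice")
    if level == "advanced":
        return BASIC_CHOICES + ADVANCED_CHOICES
    return list(BASIC_CHOICES)
```

The novice is shown only the smaller set, which reduces searching for the basic functions.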

Other characteristics of this interface include color coding to help prompt the user as to 
which data must be entered. Red text signifies instructions or errors, yellow text represents data 
that must be entered or has not been changed, and blue text shows newly entered program data or 
status information. Blue buttons represent buttons that should normally be pressed during the 
programming sequence. Red buttons signify an erratic pattern in the data entry, such as the 
"cancel" and "return to main menu" buttons. Of course, these colors can be replaced by other 
display attributes, such as intensity, underline, reverse video, blinking and pixel dithering pattern, 
in addition to the use of various fonts, for example where a monochrome monitor or display is 
used. 

The date may be entered in the form of a calendar rather than as numbers (e.g., "9/6/91"). 
This calendar method is advantageous because users may wish to input date data in one of three 
ways: day of the week, day relative to the present, and day of the month. The present method 
allows the current date to be highlighted, so that the calendar may be used to easily enter the 
absolute day, absolute date, and relative day. Further, the choices "today" and "tomorrow", the 
most frequently used relative recording times, are included in addition to a month-by-month 
calendar. This information is provided to avoid an unnecessary waste of time and user 
frustration. Thus, another aspect of the present invention is to provide a partially redundant 
interactive display input system which allows, according to the highest probability, the choices to 
be prominently displayed and easily available, in addition to allowing random access to all 
choices. 
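
The three ways of indicating a date noted above (day of the week, day relative to the present, and day of the month) may each be resolved to an absolute date, as in the following sketch. The function name `resolve_date` is hypothetical, and the treatment of a weekday name as its next occurrence, counting the current day, is an assumption made for the example.

```python
import datetime

WEEKDAY_NAMES = ["monday", "tuesday", "wednesday", "thursday",
                 "friday", "saturday", "sunday"]


def resolve_date(choice, today):
    """Turn a calendar choice into an absolute date.

    Hypothetical helper: `choice` may be "today", "tomorrow",
    a weekday name, or a day of the current month.
    """
    choice = str(choice).strip().lower()
    if choice == "today":
        return today
    if choice == "tomorrow":
        return today + datetime.timedelta(days=1)
    if choice in WEEKDAY_NAMES:
        # Assumption: the next occurrence of that weekday, today included.
        delta = (WEEKDAY_NAMES.index(choice) - today.weekday()) % 7
        return today + datetime.timedelta(days=delta)
    # Otherwise the choice is taken as a day of the current month.
    return today.replace(day=int(choice))
```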

The present device allows common user mistakes to be recognized and possibly 
addressed, such as confusing 12:00 PM (noon) with midnight and 12:00 AM (midnight) with noon. 
Therefore, the options of "noon" and "midnight" are provided in addition to a direct 
numeric clock input. When entering time information, leading zeros need not be entered, and 
such information may be entered in either fashion. 
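
A minimal sketch of such time entry, assuming a 12 hour clock with explicit "noon" and "midnight" choices, is given below. The function name `to_24_hour` and the internal 24 hour representation are assumptions made for illustration only.

```python
def to_24_hour(hour, minute=0, meridiem=None):
    """Convert a 12 hour entry, or "noon"/"midnight", to (hour, minute).

    Hypothetical helper: the internal 24 hour form is an assumption.
    """
    if hour == "noon":
        return (12, 0)
    if hour == "midnight":
        return (0, 0)
    # int() accepts "7" and "07" alike, so leading zeros are optional.
    hour, minute = int(hour), int(minute)
    if not (1 <= hour <= 12 and 0 <= minute <= 59):
        raise ValueError("time out of range")
    if meridiem not in ("AM", "PM"):
        raise ValueError("AM or PM must be chosen")
    hour %= 12                      # 12 AM becomes 0, 12 PM becomes 0 + 12
    if meridiem == "PM":
        hour += 12
    return (hour, minute)
```

The named choices and the numeric entry of 12:00 reach the same stored value, so a user who is unsure which of AM or PM denotes noon need not decide.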

The criterion for system acceptance of input depends on how many keystrokes are required 
on the screen. If only one keystroke is required to complete input of the information, upon 
depressing the key, the programming sequence will continue. If more than one keypress is 
required, the user must depress the "OK" button to continue programming. This context 
sensitive information entry serves to avoid unnecessary input. 
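
The acceptance rule just described may be sketched as a single test applied after each keypress. The function name `accept_input` and its arguments are hypothetical.

```python
def accept_input(keys_pressed, keys_required, ok_pressed=False):
    """Return True when the programming sequence may continue.

    Hypothetical helper: `keys_required` is the number of keystrokes
    the current screen needs to complete its entry.
    """
    if not keys_pressed:
        return False                # nothing has been entered yet
    if keys_required == 1:
        return True                 # a single keypress completes the entry
    return ok_pressed               # otherwise wait for the "OK" button
```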

An on-line "help" system and on-line feedback are preferably provided to the user 
throughout various aspects of the interface. Other features include minimizing the number of 
keypresses required to program the device. These features, together with other aspects of the 
present invention, allow the user to achieve a greater efficiency with the input device than with 
prior art devices. 

The interface of the present invention applied to a VCR control preferably comprises a 
virtual keypad entry device (i.e., a representation of an array of choices), a directional input 
control for a cursor on a display screen, and selection buttons. The input device has an input 
corresponding to a direction of movement relative to the cursor position. Thus, since the present 
input device seeks to minimize the physical control elements of the human interface device, the 
display elements for a preferred embodiment of the present interface include: 

1. number keys 0-9. 
2. enter key. 
3. cancel key. 
4. status indicator. 
5. return to menu option button. 
6. program type indicator: program once, program once a week, program Monday-Friday, program everyday. 
7. Day indicators: 7 week days, today, tomorrow. 
8. Noon and midnight choices. 
9. Help button. 
10. Main menu options: Review, Enter new recording time, Set time, Set date. 
11. Timer button. 
12. Power button. 
13. AM/PM choices. 
14. 31 day calendar. 
15. 12 month choices. 
16. 3 tape speed choices. 

User dissatisfaction is generally proportionate to the length of "search time," the time 
necessary in order to locate and execute the next desired function or instruction. Search time 
may be minimized by the inclusion of up to a maximum of 4-8 choices per screen and by use of 
consistent wording and placement of items on the display. 

The present invention proceeds from the understanding that there are a number of aspects 
of a programmable interface that are desirable: 

First, users should be able to operate the system successfully, without wide disparities in 
time. It should take, e.g., a normal person interacting with a VCR interface, less than seven 
minutes to set the time and two programs. Searching time spent in setting the clock, 
programming, getting into the correct mode, and checking whether or not the VCR is set 
correctly should be kept to a minimum through the appropriate choices of menu layout and the 
presentation of available choices. 

Second, programming should be a stand-alone process, and not require an instruction 
manual. A help system should be incorporated in the interface. Word choices should be 
understandable, with a reduction in the use of confusing word terminology. Error messages 
should be understandable. The system should provide the ability to cancel, change or exit from 
any step. 

Third, the system should provide on-screen understandable information, with adequate 
visual feedback. The displays should be consistent. Color coding should be employed, where 
applicable, using, e.g., blue - new input; red - error condition; yellow - static, unchanged value. 
Layouts should be logical, and follow a predictable pattern. There should be a maximum of 4-8 
choices per screen to minimize searching time. Keys should be labeled with text rather than with 
ambiguous graphics. However, a combination of both may be preferable in some cases. 

Fourth, steps required to complete tasks should be simple, require a short amount of time 
and not create user frustration. The system should guide the user along a decision path, 
providing automatic sequencing of steps. The most frequently used choices should be provided 
as defaults, and smart screens may be employed. The learning curve should be minimized 
through the use of easily understandable choices. As a user becomes more sophisticated, the 
interface may present more advanced choices. 

Fifth, there should be a reminder to set the timer and to insert the tape once the 
programming information is entered. This reminder may also be automated, to eliminate the 
commonly forgotten step of setting the timer, so that the VCR automatically sets the timer as 
soon as the necessary information is entered and a tape is inserted. Once the program is set in 
memory, a message should appear if a tape is not inserted. If the VCR is part of a "jukebox" 
(automatic changer), the tape may be automatically loaded. The VCR should preferably turn on 
when a tape is inserted. In addition, users should also be able to control the VCR with a Power 
button. 

Sixth, the VCR should be programmable from both the remote device and the control 
panel. 

Seventh, each operation should require only one keypress, if possible, or otherwise 
reduce the number of keypresses required. There should be a 12 hour clock, not a 24 hour clock. 
There should be an on-screen keypad with entry keys, not "up" and "down" selector keys, 
allowing for the choice of specific day or time entry. There should be a "start" and a "stop" 
recording time, rather than "start" time and "length of program" or duration exclusively. The 
number of buttons on the remote control should be minimized so that as few buttons as are 
required are provided. The input device should provide for the direct manipulation of screen 
elements. A menu driven interface should be provided. 

The interface of the present invention provides an automatic sequencing of steps which 
does not normally let the user continue until the previous step is complete. This is shown schematically 
in Fig. 16. In this manner, important steps will not be inadvertently omitted. Upon entering the 
programming sequence, if the current date or time is not set, the interface will prompt the user to 
enter this information. Thereafter, the interface will normally default to the main menu, the most 
frequently used first screen. Thus, the interface of the present invention is adaptive, in that its 
actions depend on the current state of the device, including prior programming or use of the 
device by the user. It can be appreciated that this adaptive behavior can be extended to include 
extended "intelligence". For example, if the device is similarly programmed on a number of 
occasions, then the default setup may be adapted to a new "normal" program mode. Further, the 
apparatus could provide multiple levels of user interface, e.g. beginner, intermediate, and 
advanced, which may differ for various functions, based on the behavior of the user. This user 
interface level determining feature extraction system is shown diagrammatically in Fig. 18. In 
contrast, prior art interfaces that have different user interface levels allow the user to explicitly 
choose the interface level, which will then be used throughout the system until reset. 
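
The adaptation of the default setup after the device has been similarly programmed on a number of occasions might be sketched as follows. The function name `adapted_default` and the threshold `min_repeats` are assumptions; the specification does not state how many occasions are required.

```python
from collections import Counter


def adapted_default(history, factory_default, min_repeats=3):
    """Return the default program setup to offer the user.

    Hypothetical helper: `history` is a list of past setups, each a
    hashable value such as a tuple; `min_repeats` is an assumed
    threshold, not a figure from the specification.
    """
    if not history:
        return factory_default
    setup, count = Counter(history).most_common(1)[0]
    # Adopt a new "normal" program mode only once it has recurred enough.
    return setup if count >= min_repeats else factory_default
```

Until a pattern has recurred, the original default is retained, so that a single unusual recording does not alter the behavior of the interface.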

The present system allows discrete tasks to be conducted more quickly, more efficiently, 
with reduced search time and with fewer errors than prior art systems. 

EXAMPLE 2 

SERIAL RECORDING MEDIUM INDEX 

In a preferred embodiment of the present invention, in a VCR, in order to track the 
content of the tape, a directory or a catalog is recorded, preferably digitally, containing the 
programming information, as well as additional information about the recorded programs, in a 
header, i.e., at the beginning of the tape, or at other locations on the tape. The device may also 
catalog the tape contents separately, and based on an identification of the tape, use a separately 
stored catalog. A preferred format for storing information is shown in Fig. 19. 

Thus, if there are a number of selections on the tape, the entire contents of the tape could 
be accessible quickly, without the need for searching the entire tape. In a sequential access 
medium, the tape transport apparatus must still shuttle to the location of the desired material, but 
it may do so at increased speeds, because there is no need to read the tape once the location is 
determined; after the tape transport nears the desired spot, the tape may be slowed or precisely 
controlled to reach the exact location. 

The tape read and drive system is shown schematically in Fig. 20. The algorithm used in 
the final stage of approach to the desired portion of the tape or other recording medium may 
incorporate a control employing fuzzy logic, neural networks, mathematical formulae modeling 
the system (differential equations) in a model-based system, a Proportional-Integral-Derivative 
(PID) system, or a controller employing an algorithm of higher order, or other known control 
methods. 
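
Of the control methods listed, a PID controller is perhaps the simplest to illustrate. The sketch below is a textbook discrete PID loop applied to a highly idealized tape transport in which the commanded speed is reached instantly; the class `PID`, the function `approach`, and the gains used are assumptions and do not describe the transport control of Fig. 20.

```python
class PID:
    """Textbook discrete PID controller (illustrative assumption)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.previous_error = None

    def update(self, error, dt):
        """Return the control output for the current position error."""
        self.integral += error * dt
        if self.previous_error is None:
            derivative = 0.0        # no earlier sample to difference against
        else:
            derivative = (error - self.previous_error) / dt
        self.previous_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


def approach(target, position, controller, steps, dt=0.1):
    """Idealized final approach: the output is used directly as tape speed."""
    for _ in range(steps):
        speed = controller.update(target - position, dt)
        position += speed * dt
    return position
```

With a purely proportional gain, the tape slows as the remaining distance shrinks, which corresponds to the slowed, precisely controlled final stage described above.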

If a selection is to be recorded over, the start and stop locations would be automatically 
determined from the locations already indicated on the tape. Further, this information could be 
stored in a memory device (which reads a catalog or index of the tape when a new tape is loaded) 
or a non-volatile memory device (which stores information relating to known tapes within the 
device) or both types of memory in the VCR, so that an index function may be implemented in 
the VCR itself, without the need to read an entire tape. Optionally, a printer, such as a thermal 
label printer (available from, e.g. Seiko Instruments, Inc.), attached to the device, could be 
available to produce labels for the tapes, showing the index, so that the contents of a tape may be 
easily indicated. A label on the tape may also include a bar code or two-dimensional coding 
system to store content or characterization information. The stored identification and index 
information is thus stored in a human or machine readable form. 

These contents, or a list of contents, need not necessarily be manually entered by the user 
or created by the apparatus; rather, these may be derived from published data or a database, data 
transmitted to the control, and/or data determined or synthesized by the control itself. For 
example, broadcast schedules are available in electronic or machine readable form, and this 
information may be used by the apparatus. 

EXAMPLE 3 

SERIAL DATA MEDIUM INDEX 

Another aspect of the present invention relates to the cataloging and indexing of the 
contents of a storage medium. While random access media normally incorporate a directory of 
entries on a disk, and devices such as optical juke boxes normally are used in conjunction with 
software that indexes the contents of the available disks, serial access mass storage devices, such 
as magnetic tape, do not usually employ an index; therefore, the entire tape must be searched in 
order to locate a specific selection. 

In the present invention, an area of the tape, preferably at the beginning of the tape or at 
multiple locations therein, is encoded to hold information relating to the contents of the tape. 
This encoding is shown in Fig. 19, which shows a data format for the information. This format 
has an identifying header 1901, a unique tape identifier 1902, an entry identifier 1903, a start 
time 1904, an end time 1905 and/or a duration 1906, a date code 1907, a channel code 1908, 
descriptive information 1909 of the described entry, which may include recording parameters and 
actual recorded locations on the tape, as well as a title or episode identifying information, which 
may be a fixed or variable length entry, optionally representative scenes 1910, which may be 
analog, digital, compressed form, or in a form related to the abstract characterizations of the 
scenes formed in the operation of the device. Finally, there are error correcting codes 1911 for 
the catalog entry, which may also include advanced block encoding schemes to reduce the effect 
of non-Gaussian correlated errors which may occur on video tape, transmission media and the 
like. This information is preferably a modulated digital signal, recorded on, in the case of Hi-Fi 
VHS, one or more of the preexisting tracks on the tape, including the video, overscan area, 
Audio, Hi-Fi stereo audio, SAP or control tracks. It should be noted that an additional track 
could be added, in similar fashion to the overlay of Hi-Fi audio on the video tracks of Hi-Fi 
VHS. It is also noted that similar techniques could be used with Beta format, 8mm, or other 
recording systems, to provide the necessary indexing functions. 
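
The catalog entry format of Fig. 19 may be pictured, purely by way of example, as a record with one field per numbered element. The sketch below is an assumption in several respects: the byte layout, the header value `CAT1`, the JSON payload, and the names `CatalogEntry`, `encode` and `decode` are all hypothetical, and a CRC-32 checksum, which only detects errors, is substituted for the error correcting codes 1911.

```python
import json
import zlib
from dataclasses import dataclass, asdict

HEADER = b"CAT1"  # hypothetical value for the identifying header 1901


@dataclass
class CatalogEntry:
    tape_id: str            # unique tape identifier 1902
    entry_id: int           # entry identifier 1903
    start_time: str         # start time 1904
    end_time: str           # end time 1905
    date_code: str          # date code 1907
    channel_code: int       # channel code 1908
    description: str = ""   # descriptive information 1909


def encode(entry):
    """Serialize an entry as header + payload + CRC-32 (detection only)."""
    payload = json.dumps(asdict(entry), sort_keys=True).encode("utf-8")
    return HEADER + payload + zlib.crc32(payload).to_bytes(4, "big")


def decode(record):
    """Recover an entry, refusing records that fail the checksum."""
    if record[:4] != HEADER:
        raise ValueError("not a catalog record")
    payload, checksum = record[4:-4], record[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != checksum:
        raise ValueError("catalog record is corrupted")
    return CatalogEntry(**json.loads(payload.decode("utf-8")))
```

A practical embodiment would replace the checksum with a true error correcting code suited to the recording medium.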

Digital data may also be superimposed as pseudonoise in the image information, or as 
other information intermixed or merged with the video information. 

The recording method is preferably a block encoding method with error correction within 
each block, block redundancy, and interleaving. Methods are known for reducing the error rate 
for digital signals recorded on unverified media, such as videotape, which are subject to burst 
errors and long term non-random errors. Such techniques reduce the effective error rate to 
acceptable levels. These are known to those skilled in the art and need not be discussed herein in 
detail. A standard reference related to this topic is Digital Communications by John G. Proakis, 
McGraw-Hill (1983). The digital data recording scheme is best determined according to the 
characteristics of the recording apparatus. Therefore, if, e.g., a Sony Corporation helical scan 
recording/reproducing apparatus were employed, one of ordinary skill in the art would initially 
reference methods of the Sony Corporation for an optimal error correcting recording 
scheme, which are available in the patent literature, in the U.S., Japan, and internationally, and 
the skilled artisan would also review the known methods used by other manufacturers of digital 
data recording equipment. Therefore, these methods need not be explained herein in detail. 
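
The role of interleaving against burst errors may nevertheless be illustrated with a simple block interleaver, offered as a generic sketch and not as the scheme of any particular manufacturer. The function names and the choice of three blocks of four symbols are assumptions.

```python
def interleave(symbols, rows):
    """Write symbols row by row into `rows` blocks, read them column by column."""
    if len(symbols) % rows:
        raise ValueError("length must be a multiple of the number of rows")
    cols = len(symbols) // rows
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]


def deinterleave(symbols, rows):
    """Invert interleave(), restoring the original block order."""
    if len(symbols) % rows:
        raise ValueError("length must be a multiple of the number of rows")
    cols = len(symbols) // rows
    return [symbols[c * rows + r] for r in range(rows) for c in range(cols)]
```

With three blocks, a burst that damages three consecutive symbols on the tape touches each block only once after deinterleaving, so that error correction within each block need only repair a single symbol.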

The catalog of entries is also preferably stored in non-volatile memory, such as hard disk, 
associated with the VCR controller. This allows the random selection of a tape from a library, 
without need for manually scanning the contents of each tape. This also facilitates the random 
storage of recordings on tape, without the requirement of storing related entries in physical 
proximity with one another so that they may be easily located. This, in turn, allows more 
efficient use of tape, because of reduced empty space at the end of a tape. The apparatus is 
shown schematically in Fig. 20, in which a tape drive motor 2001, controlled by a transport 
control 2002, which in turn is controlled by the control 2003, moves a tape 2005 past a reading 
head 2004. The output of the reading head 2004 is processed by the amplifier/demodulator 2006, 
which produces a split output signal. One part of the output signal comprises the analog signal 
path 2007, which is described elsewhere. A digital reading circuit 2008 transmits the digital 
information to a digital information detecting circuit 2009, which in turn decodes the information 
and provides it to the control 2003. 

In order to retrieve an entry, the user interacts with the same interface that is used for 
programming the recorder functions; however, the user selects different menu selections, which 
guide him to the available selections. This function, instead of focusing mainly on the particular 
user's history in order to predict a selection, would analyze the entire library, regardless of which 
user instituted the recording. Further, there would likely be a bias against performing identically 
the most recently executed function, and rather the predicted function would be an analogous 
function, based on a programmed or inferred user preference. This is because it is unlikely that a 
user will perform an identical action repeatedly, but a pattern may still be derived. 

It is noted that the present library functions differ from the prior art VHS tape index 
function, because the present index is intelligent, and does not require the user to mark an index 
location and explicitly program the VCR to shuttle to that location. Rather, the index is content 
based. Another advantage of the present library function is that it can automatically switch 
media and recording format, providing an adaptive and/or multimode recording system. Such a 
system might be used, for example, if a user wishes to record, e.g., "The Tonight Show With 
Johnny Carson" in highly compressed form, e.g. MPEG-2 at 200:1 compression, except during 
the performance of a musical guest, at which time the recording should have a much lower loss, 
e.g., MPEG-2 at 20:1, or in analog format uncompressed. A normal VCR could hardly be used 
to implement such a function even manually, because the tape speed (the analogy of quality 
level) cannot generally be changed in mid recording. The present system could recognize the 
desired special segment, record it as desired, and indicate the specific parameters on the 
information directory. The recorded information may then be retrieved sequentially, as in a 
normal VCR, or the desired selection may be preferentially retrieved. If the interface of the 
present invention is set to automatically record such special requests, the catalog section would 
then be available for the user to indicate which selections were recorded based upon the implicit 
request of the user. Because the interface has the ability to characterize the input and record 
these characterizations in the index, the user may make an explicit request different from the 
recording criteria, after a selection has been recorded. The controller would then search the 
index for matching entries, which could then be retrieved based on the index, and without a 
manual search of the entire tape. Other advantages of the present system are obvious to those of 
ordinary skill in the art. 
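
The search of the index for matching entries might be sketched as follows, on the assumption that each index entry carries a list of characterizations recorded at the time of capture. The function name `find_entries`, the dictionary layout, and the sample data in the usage note are hypothetical.

```python
def find_entries(index, wanted):
    """Return index entries whose characterizations include every wanted term.

    Hypothetical helper: each entry is assumed to be a dict with a
    "characterizations" list; matching ignores letter case.
    """
    wanted = {term.lower() for term in wanted}
    return [
        entry for entry in index
        if wanted <= {term.lower() for term in entry["characterizations"]}
    ]
```

For example, given an index in which one entry is characterized as "talk show" and "musical guest", a later request for "musical guest" would return that entry together with its recorded tape location, without a search of the tape itself.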

A library system is available from Open Eyes Video, called "Scene Locator", which 
implements a non-intelligent system for indexing the contents of a videotape. See NewMedia, 
November/December 1991, p. 69. 

It is noted that, if the standard audio tracks are used to record the indexing information, 
then standard audio frequency modems and recording/receiving methods are available, adapted 
to record or receive data in half-duplex mode. These standard modems range in speed from 300 
baud to about 64 kilobits per second, e.g. V.29, V.17, V.32, V.32bis, V.34, V.90, V.91, etc. While 
these systems are designed for dial-up telecommunications, and are therefore limited to the 
data rates available from POTS, operate at a slower speed than necessary, and 
incorporate features unnecessary for closed systems, they require a minimum of design effort and 
the same circuitry may be multiplexed and also be used for telecommunication with an on-line 
database, such as a database of broadcast listings, discussed above. It should be noted that a 
full-duplex modem should be operated in half-duplex mode when reading or recording on a medium, 
thus avoiding the generation of unnecessary handshaking signals. Alternatively, a full-duplex 
receiver may be provided with the resulting audio recorded. A specially programmed receiver 
may extract the data from the recording. DTMF codes may also be employed to store 
information. 

The Videotext standard may also be used to record the catalog or indexing information on 
the tape. This method, however, if used while desired material is on the screen, makes it difficult 
(but not impossible) to change the information after it has been recorded, without re-recording 
entire frames, because the videotext uses the video channel, during non-visible scan periods 
thereof. The video recording system according to the present invention preferably faithfully 
records all transmitted information, including SAP, VAR, closed caption and videotext 
information, which may be used to implement the various functions. 

On-line database listings may be used by the present interface to provide 
information to be downloaded and incorporated in the index entry of the library function, and 
may also be used as part of the intelligent determination of the content of a broadcast. This 
information may further be used for explicitly programming the interface by the user, in that the 
user may be explicitly presented with the choices available from the database. 
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EXAMPLE 4 

CONTROLLED ENCRYPTION AND ACCOUNTING SYSTEM 

The present invention also allows for scrambling, encryption and locking of source 
material, and the receiving device selectively implements an inverse process or a partial inverse 
process for descrambling, decryption or unlocking of the material, much as the Videocipher 
series systems from General Instruments, and the fractal enciphering methods of Entertainment 
Made Convenient² Inc. (EMC², and related companies, e.g., EMC³, and Iterated Systems, Inc.). 
The present invention, however, is not limited to broadcasts, and instead could implement a 
system for both broadcasts and prerecorded materials. In the case of copying from one tape to 
another, such a system could not only provide the herein mentioned library functions of the 
present invention according to Example 2, it could also be used to aid in copy protection, serial 
copy management, and a pay-per-view royalty collection system. 

Such a system could be implemented by way of a telecommunication function 
incorporated in the device, shown as block 1808 of Fig. 18, or an electronic tag which records 
user activity relating to a tape or the like. Such tags might take the form of a smart card, 
PCMCIA device, or other type of storage device. A royalty fee, etc., could automatically be 
registered to the machine either by telecommunication or registry with the electronic tag, 
allowing new viewer options to be provided as compared with present VCR's. 

Numerous digital data encryption and decryption systems are known. These include 
DES, "Clipper", elliptic key algorithms, public key/private key (RSA, etc.), PGP, and others. 
Digital encryption allows a sender to scramble a message so that, with an arbitrary degree of 
difficulty, the message cannot be determined without use of a decryption key. 

An encrypted tape or other source material may be decrypted with a decryption key 
available by telecommunication with a communication center, remote from the user, in a 
decryption unit, shown schematically as the decrypt unit 1806a of Fig. 18. Such an 
encryption/decryption scheme requires special playback equipment, or at least equipment with 
decryption functionality, and thus any usage of decrypted data may be registered as a result of the 
requirement to receive a decryption key. The decryption unit may be part of an addressable 
remote unit for control of the unit remotely. 

During acquisition of the electronic decryption key, a VCR device of an embodiment of 
the present invention would indicate its identity or electronic address, and an account is charged 
a fee for such use. The negotiation for the electronic key is also preferably encrypted. In 
addition, the decryption key may be specific for a particular decoder. Such a system could also 
be used for controlled access software, for example for a computer, wherein a remote account is 
charged for use of the software. Information communication may be through the Internet or 
through an on-line service such as America Online or Compuserve. 
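
By way of illustration only, the key acquisition exchange described above may be sketched as follows. The `KeyCenter` class, the use of HMAC-SHA-256 to derive a decoder-specific key, and the fee value are assumptions introduced for this example and are not taken from the specification; encryption of the negotiation itself is omitted.

```python
import hashlib
import hmac


class KeyCenter:
    """Hypothetical communication center that issues keys and charges accounts."""

    def __init__(self, master_secret, fee_cents):
        self._master_secret = master_secret
        self._fee_cents = fee_cents
        self.charges = {}  # decoder address -> total charged, in cents

    def request_key(self, decoder_address, program_id):
        """Charge the requesting decoder's account and return its key."""
        self.charges[decoder_address] = (
            self.charges.get(decoder_address, 0) + self._fee_cents
        )
        # Deriving the key from the decoder address makes it specific to
        # that decoder: another address yields a different key.
        message = f"{decoder_address}|{program_id}".encode()
        return hmac.new(self._master_secret, message, hashlib.sha256).digest()


# Example values (assumed): two decoders request the same program.
center = KeyCenter(master_secret=b"demo-secret", fee_cents=299)
key_a = center.request_key("vcr-0001", "program-42")
key_b = center.request_key("vcr-0002", "program-42")
```

Because each request both registers a charge and returns a key, usage is accounted for as a side effect of obtaining the key, which is the property the passage relies upon.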

Such a system differs from the normal hardware "key" or "dongle" (device which attaches 
to standard hardware port for authentication and usage limitation) because it requires on-line or 
electronic access for an encryption key, which may offer different levels of use. It also differs 
from a call-in registration, because of the automatic nature of the telecommunication. This 
presently described system differs from normal pay-per-view techniques because it allows, in 
certain instances, the user to schedule the viewing. Finally, with an encryption function 
implemented in the VCR, the device allows a user to create and distribute custom "software" or 
program material. In addition, the present controller could then act as the "telecommunication 
center" and authorize decryption of the material. 

If the source signal is in digital form, a serial copy management scheme is 
preferably implemented. 

The present invention is advantageous in this application because it provides an advanced 
user interface for creating a program (i.e. a sequence of instructions), and it assists the user in 
selecting from the available programs, without having presented the user with a detailed 
description of the programs, i.e., the user may select the choice based on characteristics rather 
than literal description. 

In the case of encrypted program source material, it is particularly advantageous if the 
characterization of the program occurs without charging the account of the user for such 
characterization, and only charging the account if the program is viewed by the user. The user 
may make a viewing decision based on the recommendation of the interface system, or may 
review the decision based on the title or description of the program, or after a limited duration of 
viewing. Security of the system could then be ensured by a two level encryption system, wherein 
the initial decryption allows for significant processing, but not comfortable viewing, while the 
second level of decryption allows viewing, and is linked to the accounting system. Alternatively, 
the decryption may be performed so that certain information, less than the entirety, is available in 
a first decryption mode, while other information comprising the broadcast information is 
available in a second decryption mode. 
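
The two decryption modes may be illustrated by the following minimal sketch, in which the portion of a program needed for characterization and the portion needed for viewing are protected under separate keys, and only the second mode touches the account. The cipher shown is a toy XOR keystream built from SHA-256 and is not secure; it, along with the names `TwoLevelProgram`, `Account`, `characterize` and `view`, is an assumption made for the example.

```python
import hashlib


def keystream_xor(data, key):
    """Toy cipher: XOR data with a SHA-256 counter keystream (NOT secure)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


class TwoLevelProgram:
    """Program material split into an analysis part and a viewing part."""

    def __init__(self, analysis_data, viewing_data, analysis_key, viewing_key):
        self.analysis_blob = keystream_xor(analysis_data, analysis_key)
        self.viewing_blob = keystream_xor(viewing_data, viewing_key)


class Account:
    def __init__(self):
        self.charges = 0


def characterize(program, analysis_key):
    """First mode: recover data for characterization; no charge is made."""
    return keystream_xor(program.analysis_blob, analysis_key)


def view(program, viewing_key, account, fee):
    """Second mode: recover viewable data and charge the account."""
    account.charges += fee
    return keystream_xor(program.viewing_blob, viewing_key)


# Example values (assumed).
program = TwoLevelProgram(b"scene statistics", b"full picture data", b"k1", b"k2")
account = Account()
```

In this arrangement the interface may call `characterize` as often as needed to form a recommendation, and the fee is registered only when `view` is invoked.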

The transmission encryption system may be of any type, but for sensitive material, i.e. 
where mere distortion of the material (e.g., loss of synchronization information and phase 
distortion) would be insufficient, an analog multiple subband transform, with spread spectrum 
band hopping and digital encryption of various control signals, would provide a system which 
would be particularly difficult for the user to view without authorization, and could be effectively 
implemented with conventionally available technology. The fractal compression and encryption 
of the EMC² and Iterated Systems, Inc. system is also possible, in instances where the broadcast 
may be precompressed prior to broadcast and the transmission system supports digital data. Of 
course, if a digital storage format is employed, a strict digital encryption system of known type 
may be used, such as those available from RSA. The implementation of these encryption 
systems is known to those skilled in the art. These may include the National Bureau of 
Standards (NBS), Verifiable Secret Sharing (VSS) and National Security Agency (NSA) 
encryption standards, as well as various proprietary standards. 

EXAMPLE 5 

USER INTERFACE 

In one embodiment of the present invention, the apparatus comprises a program entry 
device for a VCR or other type of media recording system. The human interface element has an 
infrared device to allow wireless communication between the human interface device and the 
VCR apparatus proper. The human interface device also includes a direct-manipulation type 
input device, such as a trackball or joystick. Of course it is understood that various known or 
to-be-developed alternatives can be employed, as described above. 

It is noted that many present devices, intended for use in computers having graphic 
interfaces, would advantageously make use of an input device which is accessible, without the 
necessity of moving the user's hands from the keyboard. Thus, for example, Electronic 
Engineering Times (EET), October 28, 1991, p. 62, discloses a miniature joystick incorporated 
into the functional area of the keyboard. This technique is directed at a different aspect of user 
interaction with a programmable device than certain preferred embodiments of the present 
invention, in that the input device does not have a minimal number of keys. While the device 
disclosed in EET is intended for use in a full function keyboard, the preferred embodiment of the 
present invention is directed towards the minimization of the number of keys and avoidance of 
superfluous keys by provision of a pointing device. Of course, the present invention could be 
used with a full function input device, where appropriate, and the joystick of EET (10/28/91, p. 
62) would be suitable in this case. 

The interface of the present invention studies the behavior and moods of the user, in 
context, during interactions to determine the expected user level of that user as well as the 
preferences of the user. These user characteristics may change over time and circumstances. 
This means that the system studies the interaction of the user to determine the skill of the user or 
his or her familiarity with the operation and functionality of the system. By determining the skill 
of the user, the system may provide a best compromise. The purpose of this feature is to provide 
a tailored interface adapted to the characteristics of the user, thus adaptively providing access to 
various features in a hierarchical manner such that a most likely feature to be used is more easily 
accessible than an unlikely feature, but that features can generally be accessed from all or most 
user levels. The user level analysis also allows the system to teach the user of the various 
functions available, particularly when it becomes apparent that the user is being inefficient in the 
use of the system to perform a given task. Therefore, the menu structure may also be adaptive to 
the particular task being performed by the user. When combined with the user level analysis 
feature, the user efficiency feature will provide a preferable interface, with reduced learning time 
and increased usability for a variety of users. 

Thus, an important concept is that the system has at least one object having a plurality of 
functions, certain of which are unnecessary or are rarely used for various applications or in 
various contexts, while these are used with greater frequency in other contexts. Further, based 
upon predetermined protocols and learned patterns, it is possible to predict which functions will 
be used and which will not be used. 

Therefore, the system, upon recognizing a context, will reconfigure the availability or 
ease of availability of functions and allow various subsets to be used through "shortcuts". Thus, 
to some extent, the interface structure may vary from time to time based upon the use of the 
system. The prior art apparently teaches away from this concept, because it is believed to 
prevent standardization, limit the "recordability" of macros and/or instruction sheets for casual 
users, and limit the availability of technical support. Each of these concerns can be addressed, to some 
extent, by the availability of a default mode (so that users can access all information), and because 
the interface is self-simplifying in case of difficulty. However, forcing all users to always work 
in a default mode limits the improvements in productivity that may be gained by a data-sensitive 
processing system, and hence this standardization for its own sake is rejected by the present 
invention. 
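
As a minimal sketch of the context-dependent reconfiguration described above, the following orders the available functions by how often each has been used in the recognized context, while retaining a default mode that presents the full, standard ordering. The `AdaptiveMenu` class, its method names, and the use of simple usage counts as the "learned pattern" are assumptions for illustration only.

```python
from collections import defaultdict


class AdaptiveMenu:
    """Hypothetical menu that reorders functions by per-context usage."""

    def __init__(self, functions):
        self._functions = list(functions)  # default (standard) ordering
        # context -> function -> number of times used in that context
        self._usage = defaultdict(lambda: defaultdict(int))

    def record_use(self, context, function):
        self._usage[context][function] += 1

    def menu(self, context, default_mode=False):
        """Return all functions, most used first unless default mode is on."""
        if default_mode:
            return list(self._functions)
        counts = self._usage[context]
        # sorted() is stable, so unused functions keep their default order.
        return sorted(self._functions, key=lambda f: -counts[f])

    def shortcuts(self, context, n=2):
        """Return up to n functions actually used in this context."""
        counts = self._usage[context]
        return [f for f in self.menu(context) if counts[f] > 0][:n]


# Example values (assumed).
vcr_menu = AdaptiveMenu(["play", "record", "timer", "clock"])
for _ in range(3):
    vcr_menu.record_use("evening", "timer")
vcr_menu.record_use("evening", "record")
```

Every function remains reachable in every context; only the ease of access changes, and `default_mode=True` restores the standardized layout for documentation or technical support purposes.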

The improvements to be gained by using an intelligent data analysis interface for 
facilitating user control and operation of the system are more than merely reducing the average 
number of keystrokes or time to access a given function. Initial presentation of all available 
information to a new user might be too large an information load, leading to inefficiency, 
increased search time and errors. Rather, the improvements arise from providing a means for 
access of and availability to functions not necessarily known to the user, and to therefore 
improve the perceived quality of the product. 

The system to determine the sophistication of the user includes a number of storage 
registers, for storing an analysis of each act for each user. A given act is represented in a 
plurality of the registers, and a weighting system ensures that even though an act is represented 
in a number of registers, it is not given undue emphasis in the analysis. Thus, each act of the 
user may be characterized in a number of ways, and each characteristic stored in an appropriate 
register, along with a weighting representing an importance of the particular characteristic, in 
relation to other identified characteristics and in relation to the importance of the act as a whole. 
The act is considered in context, and therefore, the stored information relates to the act, the 
sequence of acts prior to the act, acts of the user which occur after the act, the results of the sequence of 
acts which include the act, and characteristics of the user which are not "acts", but rather include 
timing, mouse path efficiency, and an interaction with other users. 
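
One way to keep an act from receiving undue emphasis when it appears in several registers is to divide the importance of the act among its characteristics in proportion to their relative weights. The following is a sketch under that assumption; the function names, register names and numeric values are hypothetical.

```python
from collections import defaultdict


def record_act(registers, characteristics, act_importance=1.0):
    """Store one act's characteristics, splitting its importance among them.

    characteristics maps register name -> (value, relative weight).
    The shares stored for one act always sum to act_importance, however
    many registers the act appears in.
    """
    total = sum(weight for _, weight in characteristics.values())
    if total <= 0:
        raise ValueError("at least one characteristic needs a positive weight")
    for name, (value, weight) in characteristics.items():
        share = act_importance * weight / total
        registers[name].append((value, share))


def register_score(registers, name):
    """Weighted average of the values stored in one register."""
    entries = registers[name]
    mass = sum(share for _, share in entries)
    return sum(value * share for value, share in entries) / mass if mass else 0.0


# Example values (assumed): one act characterized in three registers.
registers = defaultdict(list)
record_act(registers, {
    "timing": (0.8, 2.0),
    "path_efficiency": (0.5, 1.0),
    "help_requests": (0.0, 1.0),
})
```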

An apparatus for performing a path information or efficiency determining function is 
shown schematically in Fig. 18, and in more detail in Fig. 21. Thus, for example, if a 
characteristic of the user is an unsteady hand while using the cursor control device, e.g. mouse, 
producing a high frequency or oscillating component, the existence of this characteristic is 
detected and quantified by the high frequency signal component detector 2112, and, depending 
on the amplitude, frequency and duration (e.g. path length), may also be detected by the path 
optimization detector 2105. Once this characteristic is detected and quantified, an adaptive filter 
may be applied by the main control 1806 to selectively remove the detected component from the 
signal, in order to improve the reliability of the detection of other characteristics and to determine 
the intended act of the user. 
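
The detection and removal of an oscillating component from a cursor path may be sketched as follows. This is an illustration only: a fixed moving average stands in for the adaptive filter, path efficiency is taken as straight-line distance divided by distance travelled, and all function names and sample data are assumptions rather than the circuits of Fig. 21.

```python
import math


def path_efficiency(points):
    """Straight-line distance divided by distance actually travelled."""
    travelled = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 1.0
    return math.dist(points[0], points[-1]) / travelled


def smooth(points, window=3):
    """Centered moving average; a simple stand-in for the adaptive filter."""
    half = window // 2
    out = []
    for i in range(len(points)):
        lo, hi = max(0, i - half), min(len(points), i + half + 1)
        n = hi - lo
        out.append((sum(p[0] for p in points[lo:hi]) / n,
                    sum(p[1] for p in points[lo:hi]) / n))
    return out


def high_frequency_energy(points):
    """Mean squared deviation of the path from its smoothed version."""
    smoothed = smooth(points)
    return sum(math.dist(p, s) ** 2 for p, s in zip(points, smoothed)) / len(points)


# Example paths (assumed): a steady hand and one with a rapid tremor.
steady = [(float(x), 0.0) for x in range(11)]
shaky = [(float(x), 2.0 if x % 2 == 0 else -2.0) for x in range(11)]
```

A high value from `high_frequency_energy` together with a low `path_efficiency` would indicate the unsteady-hand characteristic; applying `smooth` before interpreting the movement recovers a path closer to the intended one.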

It should be noted that the various characteristic filters preferably act in "parallel" at each 
stage of the characteristic recognition, meaning that one characteristic is defined simultaneously 
with the detection of other characteristics, which assists in resolving ambiguities, allows for 
parallel processing by a plurality of processing elements which improves real-time recognition 
speed, and allows a probability-based analysis to proceed efficiently. Such a "parallel" 
computation system is included in a neural net computer, and a hardware-implementation of a 
neural net/fuzzy logic hybrid computer is a preferred embodiment, which allows fuzzy rules to be 
programmed to provide explicit control over the functioning of the system. It is preferred that a 
human programmer determine the basic rules of operation of the system, prior to allowing a 
back-propagation of errors learning algorithm to improve and adapt the operation of the system. 

The adaptive system implemented according to the present invention, by detecting a user 
level, allows a novice user to productively interact with the system while not unnecessarily 
limiting the use of the adaptive interface by an advanced user, who, for example, wishes to move 
the cursor quickly without the limiting effects of a filter which slows cursor response. 

Another example of the use of an adaptive user interface level is a user who repeatedly 
requests "help" or user instructions, through the explicit help request detector 2115, which causes 
an output from the current help level output 2102; such a user may benefit from an automatic 
context-sensitive help system. However, such a system may interfere with an advanced user, and 
is unnecessary in that case and should be avoided. This adaptive user interface level concept is 
not limited to a particular embodiment of the present invention, such as a VCR, and in fact, may 
be broadly used wherever a system includes an interface that is intended for use by both 
experienced and inexperienced users. This differs from normal help systems which must be 
specifically requested, or "balloon help" (Apple Computer, Macintosh System 7.0, 7.1, 7.5) 
which is either engaged or disengaged, but not adaptive to the particular situation based on an 
implicit request or predicted need. In the case of a single user or group of users, the interface 
could maintain a history of feature usage for each user, as in the past user history block 2107, and 
provide a lower user interface level for those features which are rarely used, and therefore less 
familiar to the user, through the current user level output 2101. 

It should be noted that the present system preferably detects an identity of a user, and 
therefore differentiates between different users by an explicit or implicit identification system. 
Therefore, the system may accumulate information regarding users without confusion or 
intermingling. 

EXAMPLE 6 

VCR PROGRAMMING PREFERENCE PREDICTION 

The device according to the present invention is preferably intelligent. In the case of a 
VCR, the user could also input characteristics of the program material that are desired, and 
characteristics of program material that are not desired. The device would then, over time, 
monitor various broadcast choices, and identify those which most closely match the criteria. 
For example, if the user prefers "talk-shows", and indicates a dislike for "situation 
comedies" ("sitcoms"), then the device could scan the various available choices for 
characteristics indicative of one or the other type of programming, and perform a correlation to 
determine the most appropriate choice(s). A sitcom, for example, usually has a "laugh track" 
during a pause in normal dialogue. The background of a sitcom is often a confined space (a 
"set"), from different perspectives, which has a large number of "props" which may be common 
or unique. This set and the props, however, may be enduring over the life of a show. 

A talk-show, on the other hand, more often relies on actual audience reaction (possibly in 
response to an "applause" sign), and not prerecorded or synthesized sounds. The set is simple, 
and the broadcast often shows a head and neck, or full body shot with a bland background, likely 
with fewer enduring props. A signal processing computer, programmed for audio and/or video 
recognition, is provided to differentiate between at least the two types with some degree of 
efficiency, and, with a possibly extended sampling time, to achieve a recognition accuracy such that, 
when this information is integrated with other available information, a reliable decision may be 
made. The required level of reliability, of course, will depend on the particular application and a 
cost-benefit analysis for the required system to implement the decision-making system. 

Since the system according to the present invention need not display perfect accuracy, the 
preferred embodiment according to the present example applies general principles to new 
situations and receives user or other feedback as to the appropriateness of a given decision. 
Based on this feedback, subsequent encounters with the same or similar data sets will produce a 
result which is "closer" to an optimal decision. Therefore, with the aid of feedback, the search 
criterion would be improved. Thus, a user could teach the interface through trial and error to 
record the desired broadcast programs. Thus, the presently described recognition algorithms may 
be adaptive and learning, and need not apply a finite set of predetermined rules in operation. For 
such a learning task, a neural network processor may be implemented, as known in the art. 
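
The trial-and-error teaching described above may be illustrated with a single-layer perceptron, used here as a deliberately simpler stand-in for the neural network processor mentioned in the text. The class name, feature names and learning rate are assumptions for the example.

```python
class PreferenceLearner:
    """Hypothetical online learner that adapts to user feedback."""

    def __init__(self, learning_rate=0.5):
        self.weights = {}
        self.learning_rate = learning_rate

    def score(self, features):
        return sum(self.weights.get(name, 0.0) * value
                   for name, value in features.items())

    def predict(self, features):
        """True if the program is predicted to be wanted by the user."""
        return self.score(features) > 0.0

    def feedback(self, features, wanted):
        """Perceptron rule: adjust weights only when the prediction was wrong."""
        if self.predict(features) == wanted:
            return
        direction = 1.0 if wanted else -1.0
        for name, value in features.items():
            self.weights[name] = (self.weights.get(name, 0.0)
                                  + direction * self.learning_rate * value)


# Example feature sets (assumed), loosely following the cues in the text.
talk_show = {"audience_reaction": 1.0, "bland_background": 1.0}
sitcom = {"laugh_track": 1.0, "many_props": 1.0}

learner = PreferenceLearner()
for _ in range(3):
    learner.feedback(talk_show, wanted=True)
    learner.feedback(sitcom, wanted=False)
```

Each correction moves the decision "closer" to the user's preference for the same or similar feature sets, and no fixed rule set needs to be supplied in advance.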

The feature extraction and correlation system according to the present invention is shown 
in Fig. 22. In this figure, the multimedia input, including the audio signal and all other available 
data, is input at the video input 2201. The video portion is transferred to a frame buffer 2202, 
which temporarily stores all of the information. All other information in the signal, including 
audio, VIR, videotext, close caption, SAP (second audio program), and overscan, is preferably 
stored in a memory, and analyzed as appropriate. The frame buffer 2202 may have an integral or 
separate prefiltering component 2203. The filtered signal(s) are then passed to a feature extractor 
2204, which divides the video frame into a number of features, including movement, objects, 
foreground, background, etc. Further, sequences of video frames are analyzed in conjunction 
with the audio and other information, and features relating to the correlation of the video and 
other information, e.g., correlation of video and audio, are extracted. Other information is also 
analyzed and features extracted, e.g., audio and close caption. All extracted features relating to 
the multimedia input are then passed to a transform engine or multiple engines in parallel, 2205. 
These transform engines 2205 serve to match the extracted features with exemplars or standard 
form templates in the template database 2206. 

It should be noted that even errors or lack of correlation between certain data may provide 
useful information. Therefore, a mismatch between audio and close caption or audio and SAP 
may be indicative of useful information. For non-video information, exemplars or templates are 
patterns which allow identification of an aspect of the signal by comparing the pattern of an 
unidentified signal with the stored pattern. Thus, the voice patterns of particular persons and 
audio patterns of particular songs or artists may be stored in a database and employed to identify 
a source signal. 

The transformed extracted features and the templates are then correlated by a correlator or 
correlators 2207. The parallelization of implementation of the transforms and correlators serves 
to increase the recognition speed of the device. It should be understood that appropriate systems 
for parallelization are known in the art. For example, the TMS 320C80, also known as the TI 
MVP (Texas Instruments multimedia video processor) contains four DSP engines and a RISC 
processor with a floating point unit on a single die. A board including a TMS 320C80 is 
available from General Imaging Corp., Billerica MA, the S/IP80, which may be programmed 
with ProtoPIPE. In addition, a board including a TMS 320C80 is also available from Wintriss 
Engineering Corp., San Diego, CA. Multiple MVP processors may also be parallelized for 
additional computing power. The MVP may be used to analyze, in parallel, the multimedia input 
signal and correlate it with stored patterns in a database. In this context, correlation does not 
necessarily denote a strict mathematical correlation, but rather indicates a comparison to 
determine the "closeness" of an identified portion of information with an unidentified portion, 
preferably including a reliability indicator as well. For neural network-based processing, specific 
hardware accelerators are also available, such as from Nestor, Inc. and Intel. Therefore, since there 
may be multiple recognizable aspects of the unidentified data, and various degrees of genericness 
of the characteristic recognized, it is preferred that, at this initial stage of the recognition process, 
the output of the correlators 2207 be a data set, e.g. a matrix, series of pointers, or other 
arrangement, so that sufficient information is available for higher level processing to allow 
application of an appropriate decision process. Of course, if the characteristic to be detected is 
simple and well defined, and the decision-making process may be implemented with a simple 
correlation result, then a complex data set output is not required. In fact, the output of the 
correlator may have a number of different forms, based on the context of the recognition process. 

If, for example, an exact match to an entire frame is sought, partial match information is 
not particularly useful, and is ignored in this process. (Of course, since the system is "self- 
learning", the processing results may be maintained and analyzed for other purposes). If the 
system, on the other hand, is analyzing novel data, a full analysis would likely be necessary 
including partial results and low correlation results. 

The outputs of the correlators are input into an adaptive weighing network 2208, to 
produce a probability of a match between a given feature and a given template. The recognition 
is completed in an identifier 2209, which produces a signal identifying one or more objects in the 
video frame input. The identifier 2209 also has an output to the template database 2206, which 
reinforces the recognition by providing feedback; therefore, if the same object appears again, it 
will be more easily recognized. The template database 2206 therefore also has an input from the 
feature extractor 2204, which provides it with information regarding the features recognized. It 
is also noted that, in addition to allowing recognition, the parallel transform engines 2205, 
correlators 2207, and adaptive weighing network 2208 also allow the system to ignore features 
that, though complex, do not aid in recognition. 
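
The flow from correlators to weighing network to identifier may be sketched as below. The sketch assumes cosine similarity as the measure of "closeness", a per-feature weight table in place of the adaptive network, and a softmax to turn weighted scores into match probabilities; none of these choices is specified in the text, and all names and values are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def correlate(features, templates):
    """Correlator stage: one similarity per template and per feature.

    Returns {template label: {feature name: similarity}}, i.e. a data set
    rather than a single number, for use by later stages.
    """
    return {
        label: {name: cosine(vec, tmpl[name])
                for name, vec in features.items() if name in tmpl}
        for label, tmpl in templates.items()
    }


def identify(similarities, weights):
    """Weighing and identification: weighted sum, then softmax probabilities."""
    scores = {
        label: sum(weights.get(name, 0.0) * s for name, s in sims.items())
        for label, sims in similarities.items()
    }
    peak = max(scores.values())
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Example values (assumed).
features = {"audio": [1.0, 0.0], "shape": [0.0, 1.0]}
templates = {
    "object_a": {"audio": [1.0, 0.0], "shape": [0.0, 1.0]},
    "object_b": {"audio": [0.0, 1.0], "shape": [1.0, 0.0]},
}
similarities = correlate(features, templates)
probabilities = identify(similarities, {"audio": 1.0, "shape": 1.0})
```

Setting a feature's weight to zero causes that feature to be ignored in the decision, which corresponds to disregarding features that do not aid in recognition.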

For example, during dialogue, the soundtrack voice may correlate with the mouth 
movements. Thus, the mouth movements aid little in recognition, and may be virtually ignored, 
except in the case where a particular person's mouth movements are distinctive, e.g., Jim Nabors 
("Gomer Pyle"), and Tim Curry ("Rocky Horror Picture Show"). Thus, the complexity and 
parallelism in the intermediate recognition stages may actually simplify the later stages by 
allowing more abstract features to be emphasized in the analysis. Animation poses a special 
example where audio and image data may be separated, due to the generally non-physiologic 
relation between the image and soundtrack. 

The pattern recognition function of the present invention could be used, in a VCR 
embodiment according to the present invention, e.g., to edit commercials out of a broadcast, 
either by recognition of characteristics present in commercials, in general, or by pattern 
recognition of specific commercials in particular, which are often repeated numerous times at 
various times of the day, and on various broadcast channels. Therefore, the system may acquire 
an unidentified source signal, which may be, for example, a 30 second segment, and compare 
this with a database of characteristics of known signals. If the signal does not match any 
previously known or identified signals, it is then subject to a characterization which may be the 
same as or different from the characterization of the identified signals. The characterizations of the 
unidentified signal are then compared to characteristics to be recognized. If the unidentified 
signal meets appropriate criteria, a presumptive generic characterization is made. This 
characterization is preferably confirmed by a user later, so that a positively identified signal is 
added to the database of identified signals; however, under certain circumstances no confirmation 
is required. 
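
The sequence just described (match against known signals, otherwise attempt a generic characterization, then await confirmation) may be sketched as follows. The representation of a segment as a short list of numbers, the tolerance, and the function names are assumptions made for the example.

```python
def classify_segment(signature, known, generic_test, tolerance=0.1):
    """Return (label, needs_confirmation) for one unidentified segment.

    signature:    list of numbers characterizing the segment
    known:        dict mapping label -> signature of an identified signal
    generic_test: callable deciding whether a signature has the general
                  characteristics being sought (e.g. of commercials)
    """
    # Step 1: compare with the database of known signals.
    for label, reference in known.items():
        distance = max(abs(a - b) for a, b in zip(signature, reference))
        if distance <= tolerance:
            return label, False
    # Step 2: no specific match; attempt a presumptive generic characterization.
    if generic_test(signature):
        return "presumed commercial", True
    return None, False


def confirm(known, label, signature):
    """After user confirmation, add the signal to the identified database."""
    known[label] = signature


# Example values (assumed).
known_signals = {"ad_cola": [0.9, 0.1, 0.5]}


def looks_generic(signature):
    return signature[0] > 0.7
```

Once a presumptive characterization is confirmed and stored with `confirm`, a later repeat of the same segment is caught in the first step without further user involvement.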

Certain media present a recognizable audio or video cue when a commercial break has 
ended (e.g., sports events, such as the Olympic Games, will often have theme music or 
distinctive images). The present device need not respond immediately to such cues, and may 
incorporate a delay, which would store the information while a decision is being made. In the 
case of a video tape, the delay may be up to the time between the time of recording and the time 
of playback. Further, the temporary storage medium may be independent of the pattern 
recognition system. Thus, a system provided according to the present invention may actually 
include two independent or semi-independent data streams: the first serving as the desired signal 
to be stored, retaining visually important information, and the second providing information for 
storage relating to the pattern recognition system, which retains information important for the 
recognition process, and may discard this information after the pattern recognition procedure is 
complete. 

A system which provides a plurality of parallel data streams representing the same source 
signal may be advantageous because it allows a broadcast quality temporary storage, which may 
be analog in nature, to be separate from the signal processing and pattern recognition stage, 
which may be of any type, including digital, optical, analog or other known types, which need 
only retain significant information for the pattern recognition, and therefore may be highly 
compressed (e.g. lossy compression), and devoid of various types of information which are 
irrelevant or of little importance to the pattern recognition functions. Further, the temporary 
storage may employ a different image compression algorithm, e.g. MPEG-4, MPEG-2 or MPEG- 
1, which is optimized for retention of visually important information, while the recognition 
system may use a compression system optimized for pattern recognition, which may retain 
information relevant to the recognition function which is lost in other compression systems, 
while discarding other information which would be visually important. Advantageously, 
however, the analysis and content transmission streams are closely related or consolidated, such 
as MPEG-7 and MPEG-4. 

In a particularly advantageous arrangement, the compression algorithm is integral to the 
recognition function, preparing the data for the pattern matching and characterization, and 
therefore is optimized for high throughput. According to this embodiment, the initial 
compression may include redundant or uncompressed information, if necessary in order to 
achieve real-time or near real-time recognition, and, thus may actually result in a larger 
intermediate data storage requirement than the instantaneous data presented to the recognition 
system; however, the term "compression", in this case, applies to the long term or steady state 
status of the device, and in a real-time recognition function, the amount of data stored for use in 
recognition is preferably less than the cumulative amount of data presented, except during the 
very initial stages of data acquisition and possibly rare peaks. 

In the case where a high quality (low loss, e.g. broadcast quality) intermediate storage is 
employed, after a decision is made as to whether the data should be stored permanently or 
otherwise further processed or distributed, the data may be transferred to the appropriate system 
or subsystem of the apparatus. Alternatively, the high quality intermediate storage is retained, 
and no further processing is performed. In either case, the purpose of this storage is to buffer the 
source data until the computational latency resolves any decisions that must be made. 
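
A minimal sketch of such a buffer follows: frames are held in arrival order and released only once the recognition stage has reported a decision for them. The `DecisionBuffer` class and its methods are hypothetical, and frames are represented by simple labels.

```python
from collections import deque


class DecisionBuffer:
    """Hold frames until the recognizer decides whether to keep them."""

    def __init__(self):
        self._pending = deque()  # (index, frame) in arrival order
        self._decisions = {}     # index -> True (keep) or False (discard)
        self._next_index = 0

    def push(self, frame):
        """Buffer a frame and return the index used to refer to it later."""
        index = self._next_index
        self._pending.append((index, frame))
        self._next_index += 1
        return index

    def decide(self, index, keep):
        """Record the recognizer's decision for a buffered frame."""
        self._decisions[index] = keep

    def drain(self):
        """Release frames from the head of the buffer whose decision is known.

        Returns the released frames that were marked to keep, in order.
        """
        kept = []
        while self._pending and self._pending[0][0] in self._decisions:
            index, frame = self._pending.popleft()
            if self._decisions.pop(index):
                kept.append(frame)
        return kept


# Example values (assumed).
buffer = DecisionBuffer()
ids = [buffer.push(name) for name in ["show-1", "ad-1", "show-2"]]
```

Because `drain` stops at the first undecided frame, output order is preserved even when decisions arrive out of order, and the buffer depth simply tracks the computational latency of the recognition stage.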

According to one aspect of the present invention, the source image may be compressed 
using the so called "fractal transform", using the method of Barnsley and Sloan, which is 
implemented and available as a hardware accelerator in product form from Iterated Systems, Inc., 
Norcross, GA, as the Fractal Transform Card (FTC) II, which incorporates eight fractal transform 
integrated circuit chips, 1 MByte of Random Access Memory (RAM), and an Intel i80960CA-25 
µP, and operates in conjunction with P.OEM™ (Iterated Systems, Inc., Norcross, GA) software, 
which operates under Microsoft Disk Operating System (MS-DOS). FTC-II hardware 
compression requires approximately 1 second per frame, while software decompression on an 
Intel 80486-25 based MS-DOS computer, using "Fractal Formatter" software, can be performed 
at about 30 frames per second, which allows approximately real time viewing. The Fractal 
Video Pro 1.5 is a video codec for WIN, allowing software only playback at 15-30 fps, 70-150 
Kbytes/sec. This is a non-symmetrical algorithm, requiring more processing to compress than to 
decompress the image. The FTC-IV Compression Accelerator Board is presently available. 

This fractal compression method potentially allows data compression of upwards of 
2000:1, while still maintaining an aesthetically acceptable decompressed image result. Further, 
since the method emphasizes structural aspects of the image, as opposed to the frequency 
decomposition used in DCT methods (JPEG, MPEG), elements of the fractal method could be 
used as a part of the image recognition system. Of course, it should be appreciated that other 
fractal processing methods are available and may be likewise employed. 

Audio data is also compressible by means of fractal transforms. It is noted that the audio 
compression and image recognition functions cannot be performed on the FTC-II board, and 
therefore an alternate system must be employed in order to apply the pattern recognition aspects 
of the present invention. It should also be noted that an even more efficient compression-pattern 
recognition system could be constructed by using the fractal compression method in conjunction 
with other compression methods, which may be more efficient under certain circumstances, such 
as discrete cosine transform (DCT), e.g. JPEG or modified JPEG or wavelet techniques. Fractal 
compression systems are also available from other sources, e.g. the method of Greenwood et al., 
Netrologic Inc., San Diego, CA. See also, Shepard, J.D., "Tapping the Potential of Data 
Compression", Military and Aerospace Electronics, May 17, 1993, pp. 25-27. 

A preferred method for compressing audio information includes a model-based 
compression system. This system may retain stored samples, or derive these from the data 
stream. The system preferably also includes high-level models of the human vocal tract and 
vocalizations, as well as common musical instruments. This system therefore stores information 
in a manner which allows faithful reproduction of the audio content and also provides emphasis 
on the information-conveying structure of the audio signal. Thus, a preferred compression for 
audio signals retains, in readily available form, information important in a pattern recognition 
system to determine an abstract information content, as well as to allow pattern matching. Of 
course, a dual data stream approach may also be applied, and other known compression methods 
may be employed. 

Because of the high complexity of describing a particular signal pattern or group of audio 
or image patterns, in general, the system will learn by example, with a simple identification of a 
desired or undesired pattern allowing analysis of the entire pattern, and extraction of 
characteristics thereof for use in preference determination. 

Barnsley and Sloan's method for automatically processing digital image data consisting 
of image information, disclosed in U.S. Patents 5,065,447 and 4,941,193, both expressly 
incorporated herein by reference, consists of the steps of storing the image data in the data 
processor, then generating a plurality of uniquely addressable domain blocks from the stored 
image data, each of the domain blocks representing a different portion of the image information 
such that all of the image information is contained in at least one of the domain blocks. A 
plurality of uniquely addressable mapped range blocks corresponding to different subsets of the 
stored image data are created, from the stored image data, with each of the subsets having a 
unique address. This step includes the substep of executing, for each of the mapped range 
blocks, a corresponding procedure upon the one of the subsets of the stored image data that 
corresponds to the mapped range block. Unique identifiers are then assigned to corresponding 
ones of the mapped range blocks, each of the identifiers specifying for the corresponding mapped 
range block a procedure and an address of the corresponding subset of the stored image data. For 
each of the domain blocks, the one of the mapped range blocks that most closely corresponds 
according to predetermined criteria is selected. Finally, the image information is represented as a 
set of the identifiers of the selected mapped range blocks. This method allows a fractal 
compression of image data. In particular, Drs. Barnsley and Sloan have optimized the match of 
the domain blocks with the mapping region by minimizing the Hausdorff distance. A 
decompression of the data proceeds analogously in reverse order starting with the identifiers and 
the mapping regions to produce a facsimile of the original image. This system is highly 
asymmetric, and requires significantly more processing to compress than to decompress. 
Barnsley and Sloan do not suggest a method for using the fractal compression to facilitate image 
recognition, which is a part of the present invention. 
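
The steps recited above may be sketched in simplified form as follows. This is an
illustrative sketch only and is not the implementation of Barnsley and Sloan: it assumes a
small grayscale image held as a list of rows, uses only two "procedures" (identity and a
90 degree rotation), and substitutes a sum of squared differences for the Hausdorff distance.
All function and variable names are hypothetical. The image dimensions are assumed to be
multiples of the domain block size.

```python
def block(image, row, col, size):
    """Return the size x size sub-block whose top-left corner is (row, col)."""
    return [r[col:col + size] for r in image[row:row + size]]


def shrink(b):
    """Average each 2x2 group of pixels, halving the side length of the block."""
    n = len(b) // 2
    return [[(b[2 * i][2 * j] + b[2 * i][2 * j + 1]
              + b[2 * i + 1][2 * j] + b[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(n)] for i in range(n)]


def rotate90(b):
    """Rotate a square block 90 degrees clockwise."""
    return [list(row) for row in zip(*b[::-1])]


# The "procedures" applied to each subset; a real coder would use many more.
PROCEDURES = {"identity": lambda b: b, "rotate90": rotate90}


def distance(a, b):
    """Sum of squared pixel differences (assumed stand-in for Hausdorff distance)."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def encode(image, domain_size=2):
    """Map each domain block address to the identifier of its best mapped range block.

    An identifier is a (procedure name, subset address) pair.
    """
    subset_size = 2 * domain_size
    height, width = len(image), len(image[0])

    # Create the mapped range blocks: shrink each subset, then apply each procedure.
    mapped = {}
    for r in range(0, height - subset_size + 1, domain_size):
        for c in range(0, width - subset_size + 1, domain_size):
            small = shrink(block(image, r, c, subset_size))
            for name, procedure in PROCEDURES.items():
                mapped[(name, (r, c))] = procedure(small)

    # For each domain block, select the closest mapped range block.
    code = {}
    for r in range(0, height, domain_size):
        for c in range(0, width, domain_size):
            target = block(image, r, c, domain_size)
            code[(r, c)] = min(mapped, key=lambda ident: distance(mapped[ident], target))
    return code


image = [[(r * c) % 7 for c in range(8)] for r in range(8)]
code = encode(image)
```

The resulting `code` dictionary holds one identifier per domain block, which corresponds to the
"set of the identifiers" by which the image information is represented. The exhaustive search
over all mapped range blocks also illustrates why compression requires far more processing than
decompression.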

Basically, the fractal method proceeds from an understanding that real images are made 
up of a plurality of like subcomponents, varying in size, orientation, etc. Thus, a complex block 
of data may be described by reference to the subcomponent, the size, orientation, etc. of the 
block. The entire image may thus be described as the composite of the sub-images. This is what 
is meant by iterated function systems, where first a largest block is identified, and the pattern 
mapping is repetitively performed to describe the entire image. 

The Iterated Systems, Inc. FTC-II or FTC-IV board, if applied as a part of a system 
according to the present invention, is preferably used in conjunction with a frame-grabber board, 
such as Matrox, Quebec, Canada, Image-LC board, or a Data Translation DTI 451, DT2651, 
DT2862, DT2867, DT2861 or DT2871, which may perform additional functions, such as 
preprocessing of the image signal, and may be further used in conjunction with an image 
processing system, such as the Data Translation DT2878. Of course, it should be understood that 
any suitable hardware, for capturing, processing and storing the input signals, up to and including 
the state of the art, may be incorporated in a system according to the present invention without 
exceeding the scope hereof, as the present invention is not dependent on any particular 
subsystem, and may make use of the latest advances. For example, many modern systems 
provide appropriate functionality for digital video capture, either uncompressed, mildly 
compressed, or with a high degree of compression, e.g., MPEG-2. 

The Texas Instruments TMS320C80 provides a substantial amount of computing power 
and is a preferred processor for certain computationally intensive operations involving digital 
signal processing algorithms. A system employing parallel TMS320C40 processors may also 
be used. The Intel Pentium series (or related processors from AMD, National Semiconductor, or 
other companies), DEC/Compaq Alpha, SPARC, or other processors intended for desktop 
computing may, either individually or in multiprocessor configurations, be used to process 
signals. 

A pattern recognition database system is available from Excalibur Technologies, San 
Diego, CA. Further, IBM has had pattern recognition functionality available for its DB/2 
database system, and has licensed Excalibur's XRS image retriever recognition software for 
DB/2. See, Lu, C., "Publish It Electronically", Byte, September 1993, pp. 94-109. Apple 
Computer has included search by sketch and search by example functions in PhotoFlash 2.0. See 
also, Cohen, R., "FullPixelSearch Helps Users Locate Graphics", MacWeek, August 23, 1993, p. 
77. 

Image processing hardware and systems are also available from Alacron, Nashua NH; 
Coreco, St. Laurent, Quebec; Analogic, and others. 

A fractal-based system for real-time video compression, satellite broadcasting and 
decompression is also known from Iterated Systems, Inc. and Entertainment Made Convenient², 
Inc. (EMC²). In such a system, since the compressed signal is transmitted, the remote receiving 
system need not necessarily complete decompression prior to the intelligent pattern recognition 
function of the present invention. This system also incorporates anti-copy encryption and royalty 
and accounting documentation systems. It is noted that the EMC² system does not incorporate 
the intelligent features of the present invention. 

A preferred fractal-based system according to the present invention provides the source 
data preprocessed to allow easy and efficient extraction of information. While much 
precharacterization information may be provided explicitly, the preferred system allows other, 
unindexed information to also be extracted from the signal. Further, the preferred system 
provides for an accounting system that facilitates pay-per-view functions. Thus, the interface of 
the present invention could interact with the standard accounting system to allow royalty-based 
recording or viewing, and possibly implement a serial-copy recording prevention system. Prior 
art systems also require a user to explicitly select a program, rather than allow an intelligent 
system to assist in selection and programming of the device. The EMC² system is described in 
"EMC² Pushes Video Rental By Satellite", Electronic Engineering Times, December 2, 1991, 
p. 1, p. 98. See also, Yoshida, J., "The Video-on-demand Demand", Electronic Engineering 
Times, March 15, 1993, pp. 1, 72. 

Fractal techniques may be used to store images on a writable mass storage medium, e.g. 
CD-ROM compatible. The present system may thus be used to selectively access data on the 
CD-ROM by analyzing the images, without requiring full decompression of the image data. 

Wavelets hold promise for efficiently describing images (i.e., compressing the data) while 
describing morphological features of the image. However, in contrast to wavelet transforms that 
are not intended to specifically retain morphological information, the selection of the particular 
wavelet and the organization of the algorithm will likely differ. In this case, the transform will 
likely be more computationally complex and therefore slower, while the actual compression 
ratios achieved may be greater. 

Thus, one embodiment of the device according to the present invention may incorporate a 
memory for storing a program, before being transferred to a permanent storage facility, such as 
tape. Such a memory may include a hard disk drive, magnetic tape loop, a rewritable optical disk 
drive, or semiconductor memories, including such devices as wafer scale memory devices. This 
is shown diagrammatically as the intermediate storage 2210 of Fig. 22. The capacity of such a 
device may be effectively increased through the use of image data compression, which may be 
proprietary or a standard format, e.g., MPEG-1, MPEG-2 (Moving Picture Experts Group standard 
employing DCT encoding of frames and interframe coding), MPEG-4 (Moving Picture Experts 
Group standard employing DCT encoding of frames and interframe coding, as well as 
model-based encoding methods), JPEG (Joint Photographic Experts Group standard employing DCT 
encoding of frames), Px64 (Comité Consultatif International Télégraphique et Téléphonique 
(International Telegraph and Telephone Consultative Committee) (CCITT) standard H.261, 
videoconferencing transmission standard), DVI (Digital Video Interactive), CDI (Compact Disk 
Interactive), etc. 

Standard devices are available for processing such signals, available from 8x8, Inc., 
C-Cube, Royal Philips Electronics (TriMedia), and other companies. Image processing algorithms 
may also be executed on general purpose microprocessor devices. 

Older designs include the Integrated Information Technology, Inc. (IIT, now 8x8, Inc.) 
Vision Processor (VP) chip, Integrated Information Technology Inc., Santa Clara, CA, the 
C-Cube CL550B (JPEG) and CL950 (MPEG decoding), SGS-Thomson STI3220, STV3200, 
STV3208 (JPEG, MPEG, Px64), LSI Logic L64735, L64745 and L64765 (JPEG) and Px64 chip 
sets, and the Intel Corp. i750B DVI processor sets (82750PB, 82750DB). Various alternative 
image processing chips have been available as single chips and chip sets; in board level products, 
such as the Super Motion Compression and Super Still-Frame Compression by New Media 
Graphics of Billerica, MA, for the Personal Computer-Advanced Technology (PC-AT, an IBM 
created computer standard) bus; Optibase, Canoga Park, CA (Motorola Digital Signal Processor 
(DSP) with dedicated processor for MPEG); NuVista+ from Truevision (Macintosh video 
capture and output); New Video Corp. (Venice, CA) EyeQ Delivery board for Macintosh NuBus 
systems (DVI); Intel Corp. ActionMedia II boards for Microsoft Windows and IBM OS/2 in 
Industry Standard Architecture (ISA, the IBM-PC bus standard for 8 (PC) or 16 bit (PC-AT) slots); 
Micro Channel Architecture (MCA) (e.g., Digital Video Interactive (DVI), Presentation Level 
Video (PLV) 2.0, Real Time Video (RTV) 2.0) based machines; and as complete products, such 
as MediaStation by VideoLogic. 



Programmable devices, including the Texas Instruments TMS320C80 MVP (multimedia 
video processor) may be used to process information according to standard methods, and further 
provide the advantage of customizability of the methods employed. Various available DSP chips, 
exemplary board level signal processing products and available software are described in more 
detail in "32-bit Floating-Point DSP Processors", EDN, November 7, 1991, pp. 127-146. The 
TMS320C80 includes four DSP elements and a RISC processor with a floating point unit. 

It is noted that the present interface does not depend on a particular compression format 
or storage medium, so that any suitable format may be used. The following references describe 
various video compression hardware: Kim, Y., "Chips Deliver Multimedia", Byte, December 
1991, pp. 163-173; and Donovan, J., "Intel/IBM's Audio-Video Kernel", Byte, December, 1991, 
pp. 177-202. 

It should also be noted that the data compression algorithm applied for storage of the 
received data may be lossless or lossy, depending on the application. Various different methods 
and paradigms may be used. For example, DCT (discrete cosine transform) based methods, 
wavelets, fractals, and other known methods may be used. These may be implemented by 
various known means. A compressed image may also be advantageously used in conjunction 
with the image recognition system of the present invention, as described above. In such a case, 
the compression system would retain the information most important in the recognition function, 
and truncate the unimportant information. 

A further method of performing pattern recognition, especially of two dimensional 
patterns, is optical pattern recognition, where an image is correlated with a set of known image 
patterns represented on a hologram, and the product is a pattern according to a correlation 
between the input pattern and the provided known patterns. Because this is an optical technique, 
it is performed nearly instantaneously, and the output information can be reentered into an 
electronic digital computer through optical transducers known in the art. Such a system is 
described in Casasent, D., Photonics Spectra, November 1991, pp. 134-140. See also references 
cited therein. 
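
For comparison, the correlation of an input image against a set of known patterns may be
sketched digitally as follows. This is an electronic analogue offered for illustration only,
and it does not model the holographic apparatus; it assumes small grayscale images of equal
size and uses a normalized correlation coefficient as the measure of similarity. The function
names and the example patterns are hypothetical.

```python
import math


def normalized_correlation(a, b):
    """Return the normalized correlation (-1 to 1) of two equal-size images."""
    xs = [p for row in a for p in row]
    ys = [p for row in b for p in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0  # a flat image correlates with nothing


def best_match(frame, known_patterns):
    """Return (label, score) of the known pattern most correlated with the frame."""
    scores = {label: normalized_correlation(frame, pattern)
              for label, pattern in known_patterns.items()}
    label = max(scores, key=scores.get)
    return label, scores[label]


patterns = {
    "cross": [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    "diagonal": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
}
noisy_cross = [[0, 9, 1], [8, 9, 9], [0, 8, 0]]
label, score = best_match(noisy_cross, patterns)
```

Unlike the optical technique, the digital computation scales with the number of stored patterns
and the image size, which is one reason the two approaches may be complementary.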

These optical recognition systems are best suited to applications where an 
uncharacterized input signal frame is to be compared to a finite number of visually different 
comparison frames (i.e., at least one, with an upper limit generally defined by the physical 
limitations of the optical storage media and the system for interfacing to the storage media), and 
where an optical correlation will provide useful information. Thus, if a user wished to detect one 
of, e.g., "David Letterman", "Jay Leno", or "David Koppel", a number of different planar views, 
or holograms in differing poses, of these persons would be formed as a holographic correlation 
matrix, which could be superimposed as a multiple exposure, stacked in the width dimension, or 
placed in a planar matrix, side by side. The detection system produces, from the uncharacterized 
input image and the holographic matrix, a wavefront pattern that is detectable by photonic 
sensors. 

It is preferred that, if multiple holographic images of a particular characterization are 
employed, they each produce a more similar resulting wavefront pattern than the holographic 
images of other characterizations, in order to enhance detection efficiency. The optical pattern 
recognition method is limited in that a holographic image must be prepared of the desired pattern 
to be detected, and that optically similar images might actually be of a different image, if the 
differences are subtle. However, this method may be used in conjunction with electronic digital 
pattern recognition methods, to obtain the advantages of both. Methods are also known to 
electronically write an image to a holographic storage medium, thereby facilitating its use in a 
general-purpose image recognition system. Of course, the system may also be used to identify 
talk show guests, such as "Richard Gere" or "Cindy Crawford", or these same individuals in 
other contexts. The system may also be used for censoring, for example, to prevent minors from 
viewing adult-oriented material. This system may allow partial censoring, based on the actual 
viewed or spoken content, rather than the entire show. 

If image compression is used, once an image is compressed, it need not be decompressed 
and returned to pixel, NTSC or other standard transmission or format for storage on tape, and 
thus the compressed image information may be stored in the same format as is present in the 
temporary storage medium. Thus, the block labeled intermediate processing 2211 of Fig. 22 
shows that the intermediate storage need not retain the information as received from the frame 
buffer 2202, and in fact, may prepare it for the feature extractor 2204. In addition, the storage 
medium itself need not be normal videotape (S-VHS, VHS, Beta, 8mm, Hi-8) and may be an 
adapted analog storage technique or a digital storage technique. Various magneto-optical 
recording techniques are known, which can store between 128 MB (3½") and around 5 GB (11"), 
uncompressed, which might be suitable for storing compressed digital or analog information. 
Multilayer CD-ROM and short wavelength (e.g., blue) laser systems allow storage densities of 
about 3.5 to 10 Gbytes per disk, allowing storage of over two hours of MPEG-2 encoded video. 
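
The relationship between disk capacity and recording time may be checked with a short
calculation. The average bit rate used below (3.5 Mbit/s) is an assumed, illustrative figure
and does not appear in the present specification; actual MPEG-2 bit rates vary with content and
encoder settings. Decimal units are assumed throughout.

```python
def hours_of_video(capacity_gbytes, bitrate_mbit_per_s):
    """Return recording time in hours for a disk of the given capacity.

    Uses decimal units (1 Gbyte = 1e9 bytes, 1 Mbit = 1e6 bits).
    """
    capacity_bits = capacity_gbytes * 1e9 * 8
    seconds = capacity_bits / (bitrate_mbit_per_s * 1e6)
    return seconds / 3600.0


ASSUMED_BITRATE = 3.5  # Mbit/s; an illustrative average, not taken from the text

low = hours_of_video(3.5, ASSUMED_BITRATE)    # smallest capacity cited above
high = hours_of_video(10.0, ASSUMED_BITRATE)  # largest capacity cited above
```

Under this assumption the 3.5 Gbyte disk holds somewhat more than two hours of video, and the
10 Gbyte disk holds more than six hours.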

It is also noted that the present technology could also be applied to any sort of mass 
storage, such as for a personal computer. In such a case, a characteristic of the computer file, 
which is analogous to the broadcast program in temporary storage of a VCR, is classified 
according to some criteria, which may be explicit, such as an explicit header or identifying 
information, or implicit, such as a document in letter format, or a memorandum, as well as by 
words and word proximity. In particular, such a recognition system could differentiate various 
clients or authors based on the content of the document, and these could be stored in different 
manners. The text analysis system of a text-based computer storage system is analogous to the 
program classification system of the VCR embodiment of the present invention. However, there 
is a further analogy, in that the VCR could incorporate optical character recognition of text 
displayed in the program material, employ voice recognition, or directly receive text information 
as a part of a closed caption or videotext system. Thus, the VCR device according to the present 
invention could recognize and classify programs based on textual cues, and make decisions based 
on these cues. This might also provide a simple method of discriminating program material, for 
example, if a commercial does not include closed caption or Second Audio Program (SAP), while 
the desired program does, or vice versa, then a commercial could be discriminated from a 
program with very little computational expenditure. 
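
The caption-based discrimination described above may be sketched as follows. The sketch
assumes that each received segment has already been reduced to a record carrying a single flag
indicating whether closed caption data is present; the record layout and the names
`label_segments` and `has_captions` are hypothetical.

```python
def label_segments(segments, program_has_captions=True):
    """Label each segment 'program' or 'commercial' from caption presence alone.

    `program_has_captions` states whether the desired program carries captions;
    a segment whose caption flag differs is presumed to be a commercial.
    """
    labels = []
    for segment in segments:
        matches_program = segment["has_captions"] == program_has_captions
        labels.append("program" if matches_program else "commercial")
    return labels


segments = [
    {"start_s": 0, "has_captions": True},
    {"start_s": 600, "has_captions": False},
    {"start_s": 720, "has_captions": True},
]
labels = label_segments(segments)
```

Because only one flag per segment is tested, the computational expenditure is negligible. The
method fails, of course, when commercials and program material share the same caption status,
in which case other cues would be needed.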

EXAMPLE 7 
VCR INTERFACE 

A particular VCR interface system according to one aspect of the present invention 
includes an internal clock, a four-program memory, and the capability to display a graphical color 
interface. By providing the user with the aforementioned features, this design is a unique 
implementation for an instrument to be used for programming an event driven controller via an 
interactive display. All information that the user needs is displayed on the screen to avoid or 
minimize the unnecessary searching for information. This information includes the current date 
and current time. 

A simulation of the AKAI Inc. VCR VS303U (on-screen programming) and the interface 
of the present invention were tested to evaluate users' performance. The AKAI interface of the 
prior art, hereinafter referred to as the prior art interface, was chosen because users made the 
fewest errors while using this machine, and no user quit while programming, as compared to 
three other VCRs tested: a Panasonic (made by Matsushita, Inc.) PV4962 (Bar Coder), an RCA 
brand (formerly Radio Corporation of America, Inc.) VKP950 (on-screen programming), and a 
Panasonic brand (made by Matsushita Inc.) PV4700 (Display Panel). 

The present embodiment was constructed and tested using HyperPAD™, a rapid 
prototyping package for an IBM-PC Compatible Computer. It is, of course, obvious that the 
present embodiment could be incorporated in a commercial VCR machine by those skilled in the 
art, or be implemented on many types of general purpose computers with output screens which 
allow on-screen feedback for the programming operation. Further, the system of the present 
embodiment can include a remote-control device which communicates with a VCR through an 
infrared beam or beams, and can thus exert control over an infrared remote controlled VCR, or 
translate the programming information and communicate through an infrared remote control, 
using the standard type infrared transmitter. 

An IBM PC-AT compatible (MS-DOS, Intel 80286-10 MHz) computer was used to test 
the two simulations. In order to simulate the use of a remote control device in programming the 
VCR, an infrared device made by NView™ was attached to the computer. This device came 
with a keyboard that was used to "teach" a Memorex™ Universal Remote so that the desired 
actions could be obtained. By using a universal remote, the computer could be controlled by 
using a remote control. 

The present embodiment incorporates a mouse input device. It is understood that a small 
trackball with a button for selection, mounted on a remote control may also be employed, and 
may be preferable in certain circumstances. However, a computer mouse is easily available, and 
the mouse and trackball data are essentially similar for the type of task implemented by the user, 
with trackball performance being slightly faster. For daily use on a VCR however, a trackball 
would be a more preferable input device because it does not require a hard, flat surface, which is 
not always available to a user when programming a VCR, such as in the situation where a person 
is watching television while sitting in a chair or sofa. 

A Genius™ Mouse was used as the input device in the prototype of the interface of the 
present invention. With the mouse, the user could view all of the choices at once on the display 
screen, and then make a selection from the items on the screen by moving the cursor and then 
pressing the left mouse button. 

The interface of the present example focuses on attending to the user's needs, and the 
interface must be modified for each application. By reducing the searching, learning times, and 
entry times, the mental load is also minimized. Some tradeoffs are necessary as a result of 
subjective and objective data. Because of the difficulty in optimizing a single interface design 
for all levels of users, a menu system was used in an attempt to satisfy all these user types. 

The interface of the present example reduced the number of incorrect recordings by 50%. 
The severity of the errors is unimportant here because one wrong entry will cause an 
irretrievable mistake and the user will not record the intended program. One study reported that 
faulty inputs, which lead to missing the program, can be reported by almost every present day 
owner of a VCR. 

EXAMPLE 8 

PROGRAMMABLE DEVICE INTERFACE 

It is also noted that the interface of the present invention need not be limited to 
audio-visual and multimedia applications, as similar issues arise in various programmable 
controller environments. Such issues are disclosed in Carlson, Mark A., "Design Goals for an 
Effective User Interface", Electro/82 Proceedings, 3/1/1-3/1/4; Kreifeldt, John, "Human Factors 
Approach to Medical Instrument Design", Electro/82 Proceedings, 3/3/1-3/3/6; Wilke, William, 
"Easy Operation of Instruments by Both Man and Machine", Electro/82 Proceedings, 
3/2/1-3/2/4; Green, Lee, "Thermo Tech: Here's a common sense guide to the new thinking 
thermostats", Popular Mechanics, October 1985, 155-159; Moore, T.G. and Dartnall, "Human 
Factors of a Microelectronic Product: The Central Heating Timer/Programmer", Applied 
Ergonomics, 1983, Vol. 13, No. 1, 15-23; and "The Smart House: Human Factors in Home 
Automation", Human Factors in Practice, Dec. 1990, 1-36. 

This generalized system is shown in Fig. 23, in which the sensor array 2301 interfaces 
with a microprocessor 2302 with a serial data port 2302a, which transmits sensor data to a 
control 2303. The control 2303 further interfaces with or includes a data pattern recognition system 
2304 and an interface and programming console 2305 according to the present invention, using 
the aforementioned intelligent features and adaptive pattern recognition techniques. The control 
2303 controls the plant 2306, which includes all the controlled actuators, etc. 

EXAMPLE 9 

ADAPTIVE GRAPHIC INTERFACE 

A "smart screen" aspect according to the present invention is further explored in the 
present example. This aspect of the present invention allows the interface to anticipate or predict 
the intent of the user, and to provide, as a default user choice, the most likely action to be taken by 
the user of the programmable device, which may be either accepted or rejected by the 
user, without inordinate delay to the user. The intelligent selection feature may also 
automatically choose an option and execute the selected option, without further intervention, in 
cases where little or no harm will result. Examples of such harm include a loss of data, a 
substantial waste of the user's time and an inappropriate unauthorized allocation of 
computational resources. 

When a user regularly applies the VCR device, for example, to record a particular 
television show which appears weekly on a given television channel, at a given time, 
such an action could be immediately presented to the user as a first option, without 
forcing him to explicitly program the entire sequence. Likewise, if the user has already entered 
such a command, the presented choices could include a second most likely selection, as well as 
the possibility of canceling the previously entered command. 
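
The ordering of presented choices described above may be sketched as follows, assuming
that past recordings are logged as (channel, weekday, hour) tuples and that the most frequently
repeated action is taken to be the most likely one. The function name `build_menu` and the log
format are hypothetical; a practical embodiment could weight recent use or other factors.

```python
from collections import Counter


def build_menu(history, already_programmed=()):
    """Return menu entries ordered from most to least likely.

    Actions already programmed are not offered again for recording; instead,
    an entry allowing each to be canceled is appended.
    """
    counts = Counter(history)
    menu = [("record", action) for action, _ in counts.most_common()
            if action not in already_programmed]
    menu += [("cancel", action) for action in already_programmed]
    return menu


weekly_show = ("ch4", "Thu", 20)
history = [weekly_show] * 5 + [("ch7", "Sun", 21)] * 2 + [("ch2", "Mon", 19)]

first_menu = build_menu(history)
second_menu = build_menu(history, already_programmed=[weekly_show])
```

Here the first entry of `first_menu` is the weekly show; once that show has been programmed,
`second_menu` leads with the second most likely selection and also offers cancellation of the
earlier command.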

Further, if an entire television programming guide for a week or month is available as a 
database, the interface could actively determine whether the desired show is preempted, a repeat 
(e.g., one which has been previously recorded by the system), changed in time or programming 
slot, etc. Thus, the interface could present information to the user, of which he might not be 
aware, and/or predict an action based on that information. Such a device could, if set in a mode 
of operation that allows such, automatically execute a sequence of instructions based on a 
predicted course of action. Thus, if a user is to be absent for a period, he could set the machine 
to automatically record a show, even if the recording parameters are not known with precision at 
the time of setting by the user. Of course, this particular embodiment depends on the availability 
of a database of current broadcast schedules; however, such a database may generally be 
available, e.g., in an on-line database or broadcast data stream. 
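
The use of a schedule database to detect a preempted, repeated or rescheduled show may be
sketched as follows. The guide record layout (title, episode, channel, weekday, hour) and the
function name `resolve_recording` are hypothetical and are assumed solely for illustration; no
particular program guide format is implied.

```python
def resolve_recording(habit, guide, recorded_episodes):
    """Decide what to do about a habitual recording, given this week's guide.

    Returns an (action, reason, entry) tuple, where entry is the guide record
    to be recorded, or None when nothing is to be recorded.
    """
    showings = [e for e in guide if e["title"] == habit["title"]]
    if not showings:
        return ("skip", "preempted", None)

    new_showings = [e for e in showings if e["episode"] not in recorded_episodes]
    if not new_showings:
        return ("skip", "repeat", None)

    entry = new_showings[0]
    usual_slot = (habit["channel"], habit["weekday"], habit["hour"])
    actual_slot = (entry["channel"], entry["weekday"], entry["hour"])
    reason = "as usual" if actual_slot == usual_slot else "slot changed"
    return ("record", reason, entry)


habit = {"title": "Weekly Show", "channel": 4, "weekday": "Thu", "hour": 20}
guide = [
    {"title": "Weekly Show", "episode": 12, "channel": 4, "weekday": "Fri", "hour": 21},
    {"title": "News", "episode": 300, "channel": 4, "weekday": "Thu", "hour": 20},
]
action, reason, entry = resolve_recording(habit, guide, recorded_episodes={11})
```

In this example the show has moved from its usual slot, so the interface could either record
it automatically at the new time or present the change to the user for confirmation, depending
on the mode of operation.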

Such an on-line database system of known type may be used and need not be described in 
detail herein. Alternately, a printed schedule of broadcasts may be scanned into a computer and 
the printed information deciphered (e.g., OCR) to gain access to a database. Other methods may 
also be used to access scheduling information, e.g. Internet database, access channels on cable 
systems, dial-up services, as well as other broadcast information identifying future and imminent 
programming. Together, these methods allow semiautonomous operation, guided by 
programming preferences rather than explicit programs, where such explicit instruction is absent. 
For example, Gemstar broadcasts video program guides during the vertical blanking interval of 
certain broadcasts, e.g., NBC affiliates. TiVo and Replay Networks each rely on a dial-up 
database to transmit electronic program guide information. Gemstar has proposed use of a 900 
MHz paging network to deliver electronic program guide information, as well as low bandwidth 
uplink information. 

The smart screens according to the present invention may be implemented as follows. 
The controller may be, for example, a Microsoft Windows 95/98/ME/NT/2000 operating system 
personal computer, for example having a 600 MHz Intel Pentium III or AMD Athlon processor. 
The display screen interface as described above, according to the present invention, may be 
generated using Visual Basic™ 6 or JAVA (executing under the Java Virtual Machine). Video 
information is preferably stored in MPEG 2 format, due to the existing hardware and software 
codec support for this standard. However, alternative video compression formats may be 
employed, for example using wavelet, "fractal", or other techniques. The user input device is, 
for example, a USB port mouse or trackball device, as is well known. The display is, for 
example, a VESA standard video graphics display adapter which supports hardware or software 
MPEG 2 display, on for example a 20" color monitor. Presently, such hardware is typical for 
home computers and frequently found in office computers. 

The various parameters concerning the use of the interface are stored in the computer's 
memory, and a non-volatile mass storage device, such as a hard disk drive. Alternately, 
Electrically Erasable Programmable Read Only Memory (EEPROM) or Erasable Programmable 
Read Only Memory (EPROM), as well as battery backed Random Access Memory (RAM) could 
also be used. Advantageously, the hard disk supports apparent simultaneous reads and writes, 
meaning, with the available buffer, and at MPEG 2 data rates, the system is able to provide real 
time performance for simultaneous read and write tasks. According to various embodiments, 
three or more simultaneous tasks may be supported, although these may typically be split 
between multiple physical drives. 

While Pentium III and Athlon processors may be able to support software encoding and 
decoding of MPEG 2 streams, for example using the MGI Pure DIVA software package, the 
system preferably employs a hardware codec, such as is available from C-Cube and others. The 
use of a hardware codec provides potentially increased quality and reliability, while relieving the 
host processor from burdensome tasks, allowing it to fulfill other functions according to the 
present invention, such as use profiling, content analysis, digital communications (e.g., IP 
protocol communications on the Internet, web browsing), presentation of advertisements and 
sponsored content, and the like. 

Alternatively, an Apple Power PC, G3 or G4, or IBM Power PC implementation (e.g., 
RS6000) may be used. Further, the device may be an "embedded" design, employing an Intel 
standard-type environment (e.g., National Semiconductor Geode™ running Windows CE, 
LINUX or BeOS), or other embedded processor, such as Intel ARM or embedded Power PC from 
IBM and Motorola. See, for example, TiVo Inc./Philips Personal TV design and Replay 
Networks Replay TV designs. 

According to the present invention, especially where automated content analysis is 
required, parallel processors and dedicated digital signal processors, such as the TI 320C6000 
series, may be employed. 

According to the present invention, the interface may perform comparatively simple 
tasks, such as standard graphic user interface implementation with optimized presentation of 
screen options, or include more complex functionality, such as pattern recognition, pattern 
matching and complex user preference correlations. Therefore, hardware requirements will 
range from basic Pentium III (or other sixth generation or later Intel-derived designs), Power PC-based 
designs, MIPS, SPARC, ARM, Alpha, or other microprocessors that are used to perform 
visual or audio interface functions, to special purpose processors for implementation of complex 
algorithms, including mathematical, neural network, fuzzy logic, and iterated function systems 
(fractals). 

It should be noted that, while many aspects of the intelligent interface according to the 
present invention do not require extremely high levels of processing power, and therefore may be 
provided with inexpensive and commonly available computing hardware, other aspects involve 
complex pattern recognition and advantageously employ powerful processors to achieve a short 
processing latency. Both simple and complex interface systems, however, are included within 
the scope of the present invention. Processing may be distributed in different fashions, so that 
complex functionality may be implemented with relatively simple local hardware, with a 
substantial amount of required processing for a high level of functionality performed centrally, 
and for a large number of users. 

From the stored information regarding the prior use of the interface by the user, including 
prior sessions and the immediate session, and a current state of the machine (including a received 
data stream and information relating to the data stream previously stored), a predicted course of 
action or operation may be realized. This predicted operation is, in the context of the current user 
interface state, the most probable next action to be taken by the user. 

The predicted operation is based on: the identity of the user, if more than one user 
operates the interface and machine; the information already entered into the interface during the 
present programming session; the presently available choices for data entry; settings for the use 
of the machine, which may be present as a result of a "setup" operation; settings saved during a 
prior session; and a database of programming choices. In the case of an interface applet script, 
another program may be called that has access to the necessary data in the memory, as well as 
access to any remote database that may be necessary for implementation of the function. Using a 
predictive technology, such as Boolean logic, fuzzy logic, neural network logic, or other type of 
artificial intelligence, a most probable choice may be presented to the user for his approval, or 
another alternative choice may be selected. Further, a number of most probable choices may be 
presented simultaneously or in sequence, in order to improve the probability that the user will be 
immediately or quickly presented with an acceptable choice. If multiple choices are presented, 
and there is limited room on the display, two (or more) similar choices may be merged into a 
single menu selection, which may be resolved in a secondary menu screen, e.g. a submenu or 
dialog box. 
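
The presentation of several probable choices, with similar choices merged when display room is limited, may be sketched as follows. This is an illustrative sketch only: the function name `present_choices`, the `(label, group, probability)` tuple layout, and the use of a shared group label as the measure of similarity are assumptions made for the example and are not taken from the specification.

```python
from collections import defaultdict

def present_choices(candidates, max_slots):
    """Order predicted choices for display (hypothetical helper).

    candidates: list of (label, group, probability) tuples, where `group`
    is an assumed similarity key (e.g. a program category).
    Returns a list of (menu_text, member_labels); an entry with more than
    one member would be resolved in a submenu or dialog box.
    """
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    if len(ranked) <= max_slots:
        # Enough room: show every choice, most probable first.
        return [(label, [label]) for label, _, _ in ranked]

    # Limited room: merge similar choices into a single menu selection.
    members = defaultdict(list)
    score = defaultdict(float)
    for label, group, prob in ranked:
        members[group].append(label)
        score[group] += prob
    groups = sorted(members, key=lambda g: score[g], reverse=True)
    return [
        (g if len(members[g]) > 1 else members[g][0], members[g])
        for g in groups[:max_slots]
    ]
```

With three candidates and only two display slots, the two news programs would collapse into a single "news" entry whose members are listed in a secondary menu.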

Figure 24 shows a system for correlating a user's preferences with a prospective or real- 
time occurrence of an event. The input device 2401, which is a remote control with a pointing 
device, such as a trackball, provides the user's input to the control 2402. The program is stored 
in a program memory 2403, after it is entered. The control 2402 controls a plant 2404, which is a 
VCR. The control also controls an on-screen programming interface 2405, through which the 
user interactively enters the program information. Each program entry of the user is submitted to 
the user history database and preferences module 2406, which may also receive explicit 
preference information, input by the user through the input device 2401. The prospective and 
real time event characterization unit 2407 uses any and/or all relevant information available in 
order to determine the character of a signal input, which is a video signal, from the signal 
receiver 2408. A signal analyzer 2409 provides a preliminary analysis and characterization of 
the signal, which is input to the prospective and real time event characterization unit 2407. The 
prospective and real time event characterization unit 2407 also interacts with and receives an input 
from a telecommunication module 2410, which in turn interacts with and receives information from 
an on-line database 2411. A user preference and event correlator 2412 produces an output 
relating to a relatedness of an event or prospective event and a user preference. In the event of a 
high correlation or relatedness, the control 2402 determines that the event or prospective event is 
a likely or most likely predicted action. The prospective event discussed above refers to a 
scheduled event which is likely to occur in the future. The characterization unit also has a local 
database 2413 for storing schedule information and the like. 
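
One way the output of the user preference and event correlator 2412 might be computed is sketched below. The specification does not fix a correlation measure; cosine similarity over weighted feature dictionaries, the feature names, and the 0.5 threshold are all assumptions chosen for illustration.

```python
import math

def relatedness(preferences, event_features):
    """Cosine similarity between two sparse feature dicts (assumed measure)."""
    shared = set(preferences) & set(event_features)
    dot = sum(preferences[k] * event_features[k] for k in shared)
    norm_p = math.sqrt(sum(v * v for v in preferences.values()))
    norm_e = math.sqrt(sum(v * v for v in event_features.values()))
    if norm_p == 0 or norm_e == 0:
        return 0.0
    return dot / (norm_p * norm_e)

def likely_events(preferences, events, threshold=0.5):
    """Return names of events whose relatedness meets the (hypothetical)
    threshold, highest relatedness first."""
    scored = [(relatedness(preferences, feats), name) for name, feats in events.items()]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]
```

In this sketch, the events returned by `likely_events` play the role of the likely or most likely predicted actions determined by the control 2402.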

In the particular context of a videotape, one consideration of the user is the amount of 
time remaining on the tape. Generally, users wish to optimally fill a tape without splitting a 
program, although the optimization and non-splitting parameters may vary between users. 
Therefore, the length of the tape and the amount and character of other items on the tape are also 
factors to be employed in determining a most desired result. With respect to this issue, the 
interface may maintain a library function that allows the identification of a partially filled tape 
for recording under given circumstances. The interface may also optimize a playback by 
selecting a tape containing a desired sequence of materials. 
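
The library function for identifying a partially filled tape may be sketched as a best-fit selection, assuming the library records the remaining minutes on each tape. The function name, the dictionary layout, and the choice of "least leftover time" as the fill criterion are illustrative assumptions; as noted above, the optimization and non-splitting parameters may vary between users.

```python
def choose_tape(library, program_minutes):
    """Pick the tape that holds the whole program with the least time left over.

    library: dict mapping a tape identifier to remaining minutes (assumed layout).
    Returns a tape identifier, or None if no tape can hold the program unsplit.
    """
    fitting = {tape: rem for tape, rem in library.items() if rem >= program_minutes}
    if not fitting:
        return None
    return min(fitting, key=lambda tape: fitting[tape] - program_minutes)
```

For a 30 minute program and tapes with 20, 35 and 120 minutes remaining, this sketch selects the 35 minute tape, leaving the nearly empty tape free for longer programs.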



Hoffberg et al. 



- 165 - 



LIH-13 




The intelligent interface may also be used as a part of an educational system, due to its 
ability to adapt to the level of the user and dynamically alter an information presentation based 
on the "user level", i.e. the training status of the user, and its ability to determine areas of high 
and low performance. Likewise, the intelligent interface according to the present invention may 
also be used in a business environment for use by trained individuals who require relatively static 
software interface design for consistency and "touch typing" with memorized keystroke or mouse 
click sequences. In this case, the intelligent functionality is segregated into a separate user 
interface structure, such as an additional "pull down menu" or other available screen location. 
While the interface always monitors user performance, the impact of the analysis of the user is 
selectively applied. User analysis may also be used for performance evaluation according to 
objective criteria, based on continuous monitoring. In a network environment, user profile and 
evaluation may be made portable, stored so as to be accessible from any networked device the 
user may interact with, from office computers to thermostats to photocopying machines to coffee 
machines. 



EXAMPLE 10 

INTELLIGENT ADAPTIVE VCR INTERFACE 

In this example, a user interacting with the device intends to record a particular program, 
"Married With Children" (Fox, Sunday, 9:00 p.m., etc.) on each occurrence, and initially 
explicitly programs the device accordingly, in the manner of a typical programmable recording 
device. For example, the user may define the program by timeslot and recurrence, by use of an 
electronic program guide, by a keyword search of a program database, or by a selective filter for 
the video stream. The system analyzes this intended function, and alters the execution to implement 
a procedure for providing a full library of episodes, and not to duplicate episodes. During first-run 
shows, this execution is unlikely to differ from the simple explicit program defined by the 
user. During reruns and off-season, however, the system will filter the content to limit 
redundancy. Of course, if the user does not retain a personal archive, there will not be 
redundancy, and the rerun episodes will in that case also be recorded. 

On the other hand, the program may also be subject to the occurrence of reruns, 
syndicated distribution, multiple available network affiliates, time shifting of performance, and 
the like. In that case, assuming the user seeks to create a complete archive, unique episodes of 
the same show will also be recorded from other sources. 
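
The filtering of redundant episodes across first-run, rerun and syndicated sources may be sketched as follows. The listing fields (`episode_id`, `channel`, `start`) and the availability of a per-episode identifier from a program database are assumptions made for this example.

```python
def plan_recordings(listings, archive=None):
    """Select which airings to record (illustrative sketch).

    listings: list of dicts with assumed keys 'episode_id', 'channel', 'start'
              ('start' is an ISO-format string so that it sorts chronologically).
    archive:  set of episode ids already held, or None if the user keeps no
              personal archive, in which case every airing is recorded.
    """
    by_time = sorted(listings, key=lambda item: item["start"])
    if archive is None:
        return by_time

    seen = set(archive)
    planned = []
    for item in by_time:
        if item["episode_id"] in seen:
            continue  # already archived or already planned from another source
        seen.add(item["episode_id"])
        planned.append(item)
    return planned
```

Under these assumptions, an episode that airs first on the network and later in syndication is recorded once, from the earliest source, while an episode already in the archive is skipped.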

Where the system is operating in a content analysis mode, the system may contingently 
record extraneous information, for example, preview scenes and advertisements. Further, various 
actors appearing in the particular program also appear in other capacities and roles on television. 
Using context information, or available databases, these extraneous segments may be purged. 

Thus, the system provides an increased intelligence over explicitly programmed devices, 
potentially making the device easier to use by intelligently analyzing exceptions and extensions 
for the user. Preferably, the "translated" instructions are presented to the user for confirmation, 
for example by a simple accept/reject indication. If rejected, the system may present alternate 
execution algorithms for review by the user, or execute the user's explicit programming 
definition unmodified. 

Therefore, after the user's intent is elucidated, the interface may scan available directories 
of programming to determine when "Married With Children" will be broadcast. In addition, to 
the extent possible, all channels may be monitored, in the event that the directories are erroneous 
or incomplete. 

The human user interface system according to the present invention is not limited in 
application to video recording devices, and may be quite effective if it is used for a number of 
distinct applications, such as television, radio, desktop computer, and even kitchen appliances 
and heating ventilation air conditioning (HVAC) systems. 

Further, with a degree of portability, the same interface, including user profile 
characteristics, may be used for multiple devices. For example, preferences for processing of 
MTV channel or other music video information may be directly relevant to processing of radio or 
other music reproduction devices, and vice versa. Even more abstract issues, such as screen 
organization, number of presented choices, color selections, alarm indications, and the like, may 
be common across many different devices. 

At some point in the process, preferably prior to substantive programming input, the 
system performs a self-diagnostic check to determine whether the apparatus is set up and 
operating correctly. This would include, for many applications, a determination of whether the 
clock has been set and has thereafter been operating continuously. Of course, the clock could have, 
in practice, a battery to minimize the occurrence of problems relating to clock function. The 
interface would then, if the clock is not properly set, and if there is no telecommunication or 
other external means for automatically determining the exact time, present the user with a menu 
selection to set the proper time. Of course, if the correct time is available to the apparatus in 
some form, this could be automatically obtained, and the internal clock updated, without 
intervention. These same sources may be used to verify the accuracy of an internal clock. 
Further, if a reliable external clock system is available, an internal clock may be dispensed with 
or ignored. Time may also be inferred based on the regular schedules of broadcasts, e.g., the 
11:00 p.m. news begins at 11:00 p.m. If the user does not have access to a source of the exact 
time, the step of correcting the time may be deferred, although at some point the user should be 
reminded to verify the clock information. The user may thus be able to override a machine- 
generated request or attempt to correct the time data. 
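
The clock check described above may be sketched as a fallback chain over the available time sources. The function name, the action labels, and the 30 second tolerance are hypothetical; each external source is modeled as a callable that returns a time or `None` when unavailable.

```python
from datetime import datetime, timedelta

def resolve_time(internal, external_sources, tolerance=timedelta(seconds=30)):
    """Decide how to establish the current time (illustrative sketch).

    internal: datetime from the internal clock, or None if the clock is unset.
    external_sources: list of zero-argument callables returning a datetime or
        None (e.g. a network time query or a time inferred from broadcasts).
    Returns (time, action), where action is an assumed label.
    """
    for source in external_sources:
        reading = source()
        if reading is None:
            continue  # this source is unavailable; try the next one
        if internal is None or abs(reading - internal) > tolerance:
            return reading, "clock_updated"   # set without user intervention
        return internal, "clock_verified"     # external source confirms clock
    if internal is None:
        return None, "prompt_user"            # present menu to set the time
    return internal, "remind_user_later"      # defer, but verify eventually
```

In this sketch the user is prompted only when the clock is unset and no external source responds, which corresponds to the deferral and reminder behavior described above.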

If the machine has access to an external source of the exact time, it would then preferably 
access this source first. Such sources of exact time include a telephone connection to a voice line 
that repeats the time. The computer would then perform a speech recognition algorithm that 
would be used to determine the time. Such a speech recognition algorithm could also be used as 
a part of the user interface for other purposes, i.e. a speech recognition system is not supplied 
solely for obtaining time information. Alternatively, a modem or communication device could 
be used to obtain the time in digitally coded form over a network, which would alleviate the need 
for speech recognition capabilities for this function. 

A further method for obtaining accurate time information is to access a video signal that 
contains the desired time information. For example, many cable broadcasting systems have a 
channel that continuously broadcasts the time in image form. The interface tunes this channel, 
and acquires a representation of the screen image, thereafter performing a character recognition 
algorithm to capture the time information. This character recognition algorithm could also be 
used to obtain or capture information regarding programming schedules, stock prices, and other 
text information that may appear on certain cable broadcast channels. 

In the case of a video-recording device, the system could also verify the currency of an 
electronic program guide. If this is not current, or for example it appears corrupted, an on-line 
connection could also be used in order to obtain information concerning television scheduling. 
Preferably, the program guide data is obtained in an out-of-band signal (including separate 
channel, VBI transmission, cable modem, satellite data link, etc.) through the same medium as 
used to transmit the video programs. However, a distinct communications system, such as 
the Internet through a separate physical transport layer, may be used. 

Thus, the interface, in obtaining necessary information, employs such available data 
source access methods as speech recognition, character recognition, digital telecommunication 
means, radio wave reception and interpretation, and links to other devices. 

In a typical interaction session with the apparatus, the user first identifies himself/herself 
to the machine, which can occur in a number of ways. This step may be dispensed with, or at 
least trivialized, if only one user regularly interacts with the apparatus. Otherwise, such 
identification may be important in order to maintain the integrity of the user profiles and 
predictive aspects of the interface. A radio frequency transponder (RF-ID) or infrared transponder 
(IR-ID) system may automatically determine the user based on a device, which may be 
concealed in a piece of jewelry or a wristwatch. The user may also be identified by voice pattern 
recognition, speaker independent voice recognition, video pattern recognition, fingerprint, retinal 
scan, or other biometric evaluation. An explicit entry of the user identity may also be employed, 
wherein the user types his/her name on a keyboard or selects the name or unique identifier from a 
"pick-list". The identity of the user may also be inferred from the time and/or activity performed 
by the user. 

In another embodiment, a normal user of the system need not identify himself; rather, the 
system develops composite profiles of the set of regular users, and infers necessary 
personalization parameters from the nature of the interaction. This scheme, however, may allow 
some inefficiencies to persist until a preferred mode of operation may be determined. 

The interface, upon identifying the user, retrieves information regarding the user, which 
may include past history of use, user preferences, user sophistication, and patterns of variation of 
the user, which may be based on, e.g., time, mood, weather, lighting, biometric factors or other 
factors. If the user is not uniquely identified, then the initial interaction with the system is used 
to determine a preferred or optimal mode of interaction. 

It is noted that, since in one embodiment of the invention the system has two discrete 
asynchronous functions, namely that of programming and using the system, and that of 
manipulating the media stream, such temporally sensitive variables as user "mood" may have 
little influence on the manipulation of the media stream, since the user interaction with the 
manipulated media stream may occur at an unknown time thereafter. On the other hand, such 
temporally sensitive variables may have a profound influence on the human user interface of the 
system. 

Thus, after completing system diagnostics, including the time-check function referred to 
above, the system next determines or predicts the desired function of the user. In this regard, if 
more than one user has access to the system, the user is explicitly or implicitly identified to the 
interface, in a user identification step 1701 or an analogous action, which may be a coded entry, 
or a selection from the menu. If the interface has voice recognition capability, then the user may 
be recognized by his voice pattern, or merely by stating his name. The interface then accesses 
the memory for a profile of the past use of the machine by the user, which may include the entire 
prior history, relevant abstracts of the history, or derived user preferences, as shown in the 
personalized startup based on user profile step 1702, which information is also stored and used in 
the past user history determining element 2107. These choices differ in the amount of storage 
necessary in order to retain the desired information. 

Thus, if the user has only used the VCR to record, e.g., the National Broadcasting 
Company (NBC) 11 o'clock news, i.e., record all days from 11:00 p.m. to 11:30 p.m. on NBC, in 
the past, the most likely current predicted choice would be the NBC 11 o'clock news. If the 
interface were to present a number of choices having lower probability, then it interprets the 
recording history to be "news" based on a database of broadcast information. This 
characterization of the broadcast as "news" may be made in a number of ways: by an explicit 
identification by the user, by extracting the characteristics of the program from an electronic 
program guide, by a content-based analysis of the media stream, or by a correlation of 
characteristics of the past-selected programs with available media streams (without necessarily 
analyzing or determining the content). Therefore, a prediction of lower probability would be 
American Broadcasting Company (ABC) or Columbia Broadcasting System (CBS) news at, e.g., 
11:00 p.m., and the NBC news at, e.g., 5:00 p.m. In a cable television system, there may be a 
number of NBC affiliated news alternatives, so that these alternatives may be investigated first 
before other networks or the like are presented as likely choices. In addition, where a video feed 
is unavailable, a text feed from the Internet or an on-line service may be acquired as a probable 
alternative. 
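
The ordering of lower-probability alternatives in the news example may be sketched by scoring each candidate broadcast on the attributes it shares with the recording history. The attribute names and the weights (genre above network above start time) are assumptions chosen only to reproduce the ordering discussed above; the specification does not prescribe them.

```python
def rank_candidates(history_entry, candidates, weights=None):
    """Rank candidate broadcasts by attributes shared with a history entry.

    history_entry, candidates: dicts with assumed keys 'genre', 'network', 'start'.
    weights: hypothetical importance of each attribute.
    """
    weights = weights or {"genre": 3, "network": 2, "start": 1}

    def score(candidate):
        return sum(
            w for attr, w in weights.items()
            if candidate.get(attr) == history_entry.get(attr)
        )

    return sorted(candidates, key=score, reverse=True)
```

With a history consisting only of the NBC 11 o'clock news, this sketch places that program first, followed by other news broadcasts that share the network or the time slot, and places unrelated programming last.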

In addition, the system may define an interest profile of the user, based on explicit or 
implicit selections. Preferably, implicit selections are derived from a semantic analysis of verbal 
media voluntarily reviewed by the user. From this analysis, a set of preferences is derived. 
These preferences are then used to define a filter, operating at a contextual segment level, for 
various media streams, including news feeds, articles, Internet web searches (using standard type 
search technology), broadcasts, and the like. Some broadcasts are divided into published 
segments, so that the beginning and end of a segment may be determined based on a temporal 
scheme. On the other hand, content-based analysis may be required for other broadcasts, which 
may entail analysis of closed-caption text signals, transmitted data or metadata signals, for 
example during the VBI, audio analysis of the broadcast, video analysis of the broadcast, and/or 
a combination thereof. 

For most news or current events broadcasts, the audio and/or semantic information of the 
broadcast may be sufficient for content analysis, and therefore the analysis is simplified as 
compared to a content-based image recognition scheme. On the other hand, for entertainment 
filtering, the image content may be more reliable than semantic communications. For example, 
police drama and action entertainment often display guns, explosions, or other visual themes 
which may be reliably characterized using well developed algorithms. Thus, for example, 
algorithms similar to those used in X-ray security devices to detect firearms in luggage may be 
applied to video data to detect firearms displayed on screen. The audio track of a firearm or 
explosion is also distinctive. By contingently recording a broadcast while monitoring the 
content, it is possible to detect certain characteristics of the broadcast as a whole, and make a 
decision regarding retention after the capture and analysis is complete. Where storage space or 
recording capabilities are limited, a prefiltering algorithm is employed in order to determine 
likely broadcasts which contain the desired characteristics or meet the desired profile, and only 
the most likely programs are recorded. 
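
The two strategies above, contingent recording with a retention decision after analysis, and prefiltering where storage is limited, may be sketched as follows. The per-segment match scores, the use of their mean, and the 0.6 threshold are illustrative assumptions; the analysis that would produce such scores is outside the scope of the sketch.

```python
def retain_after_analysis(segment_scores, profile_threshold=0.6):
    """Decide retention once capture and analysis are complete.

    segment_scores: match scores in [0, 1] accumulated while the broadcast
    was contingently recorded (assumed output of a content analyzer).
    """
    if not segment_scores:
        return False
    return sum(segment_scores) / len(segment_scores) >= profile_threshold

def prefilter(programs, capacity):
    """Where recording capability is limited, keep only the most likely programs.

    programs: list of (title, estimated_match) pairs; capacity: number of
    programs that can be recorded.
    """
    ranked = sorted(programs, key=lambda p: p[1], reverse=True)
    return [title for title, _ in ranked[:capacity]]
```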

Thus, a number of likely choices, based on intelligently determined alternatives, as well 
as adaptation based on determined user preferences, are initially presented to the user, along with 
a menu selection to allow rejection of these predicted choices. In this case, the user selects the 
"reject" selection, and the system presents the user with a next predicted desired menu choice. 
Since the user history, in this case, does not provide for another choice of particularly high 
probability, the user is prompted to explicitly choose the program sequence by day, time, 
channel, and duration. The user then enters the starting time for recording according to the 
methods described above. The interface then searches its databases regarding the user and 
broadcast listings to present a most likely choice given that parameter, as well as all available 
alternatives. In this case, the user history is of little help, and is not useful for making a 
prediction. In other cases, the system uses its intelligence to "fill in the blanks", which could, of 
course, be rejected by the user if these are inaccurate or inappropriate. The most likely choices 
are then those programs that begin at the selected time. If the user had input the channel or 
network, instead of starting time, then the presented choices would be the broadcast schedule of 
the channel, e.g. channel 5 or Fox, for the selected day. 

The user then selects one of the available choices, which completes the programming 
sequence. If no database of broadcasts is available, then the user explicitly defines all parameters 
of the broadcast. When the programming is completed, the interface then updates its user 
database and prompts the user to set the VCR to record, by, e.g., inserting a blank or recordable 
tape. 

Of course, in the case of a digital video recording device which stores the program on a 
magnetic hard disk or an optical disk, there might be no need to insert a removable storage 
medium. However, through consistent use, the available storage medium is likely to be filled to 
capacity. Therefore, an important part of the operation of the device will be archival 
management. This entails purging certain recorded programs and/or transferring certain 
programs to secondary storage. 

In a preferred embodiment, the secondary storage is a VHS videocassette. In this case, 
the controller of the system produces an output suitable for recording on a standard video 
cassette recorder. This includes either an NTSC type analog video signal, or a digital signal 
modulated within the NTSC signal space. In the case of a digital signal, preferably multiple 
forms of error detection and correction codes, including interleaving, forward error correction, 
and redundancy, are employed. Further, preferably an index is defined and recorded on the 
tape. The index includes a description of content and tape offset, and possibly other information, 
such as content metadata. Preferably, this index is a digital file or set of files, although an analog 
signal may be provided, for example with key frames (extracted in known manner) with 
computer readable codes presented in the analog video signal. For example, tape offset may be 
defined as a text signal in the video frame, computer readable by an optical character recognition 
scheme. A modulated signal may also be provided on the audio tracks. An analog index, for 
example, may be human readable, and therefore not require the controller for playback. 
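
A digital form of the tape index may be sketched as a small serialized record set. The field names, the use of seconds for the tape offset, and JSON as the file format are assumptions for illustration; the specification requires only a description of content, a tape offset, and optionally other information such as content metadata.

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class IndexEntry:
    title: str                 # description of content
    tape_offset_s: int         # offset from start of tape, in seconds (assumed unit)
    duration_s: int
    metadata: dict = field(default_factory=dict)  # optional content metadata

def build_index(entries):
    """Serialize index entries, ordered by position on the tape."""
    ordered = sorted(entries, key=lambda e: e.tape_offset_s)
    return json.dumps([asdict(e) for e in ordered])

def find_offset(index_json, title):
    """Look up the tape offset of a titled item, or None if absent."""
    for entry in json.loads(index_json):
        if entry["title"] == title:
            return entry["tape_offset_s"]
    return None
```

The serialized index produced by `build_index` stands in for the digital file that would be recorded on the tape, and `find_offset` for the lookup a controller would perform before cueing the tape.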

Preferably, the controller is linked to the secondary storage VCR by standard video and audio 
cables, with an infrared transmitter provided from the controller to the VCR to control VCR 
operation. 

The secondary storage system may also be of another type, for example a magnetic or 
optical disk drive or array. 

The controller typically determines not only a preferred recording pattern of the user, but 
also a preferred "consumption" or viewing pattern of the user. When it is unlikely that the user 
will view a recorded program, for example due to staleness, disinterest, or low priority, it may be 
backed up to secondary storage, or purged. For example, in a daily serial program, if a recorded 
segment is not viewed within one week it may be deleted. In the case of news broadcasts, the 
retention may be 25-100 hours. If, on the other hand, the user seeks to archive a program or 
series without viewing, this may be managed in due course, with only slight delays. Thus, if the 
user seeks a "Honeymooners" archive, without necessarily watching the episodes regularly, these 
may be stored directly to secondary media, without requiring the primary storage media 
resources for more than a short time, if at all. 
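
The archival management rules in the preceding paragraph may be sketched as a per-recording disposition function. The category names, the record layout, the choice of the 100 hour upper bound for news, and the restriction of purging to unviewed items are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Assumed retention limits, taken from the examples in the text.
RETENTION = {
    "daily_serial": timedelta(weeks=1),
    "news": timedelta(hours=100),  # upper end of the 25-100 hour range
}

def disposition(recording, now):
    """Return 'move_to_secondary', 'purge', or 'keep' for one recording.

    recording: dict with assumed keys 'category', 'recorded_at' (datetime),
               'viewed' (bool) and optional 'archive' (bool).
    """
    if recording.get("archive"):
        return "move_to_secondary"   # archived series bypass primary storage
    limit = RETENTION.get(recording["category"])
    age = now - recording["recorded_at"]
    if limit is not None and not recording["viewed"] and age > limit:
        return "purge"               # stale and never viewed
    return "keep"
```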

Likewise, in the case of a video library application, such as recording of movies, 
recording may also be directly to a secondary storage medium, with the primary storage medium 
resources not expended for an extended period. 

On the other hand, there is a considerable volume of media consumption that is expected to 
occur, if at all, within a relatively short time period from the recording. For these media, 
recording on a fast, convenient, random access medium is preferred. For example, a 40 Gbyte hard 
disk drive, such as the Quantum QuickView drive or Seagate Technology Inc. A/V drive, with 
dual access capabilities for typical MPEG 2 data may be appropriate. In this case, the primary 
storage device provides a number of trick play advantages, such as real-time pause, rewind and 
fast-forward, variable speed playback, variable quality settings, and the like. Further, content 
analysis of video streams is preferably conducted from streams stored in the primary storage 
system. 

Another application of the primary store is for storage of the controller operating system 
and application software, required data such as user preference profiles, and user storage, 
allowing the device to perform many of the functions of a typical personal computer, even if in 
an appliance form factor. On the other hand, where the controller system is based on a standard 
computer operating system, such as Microsoft Windows, the primary media storage system is 
preferably a physically distinct device from the primary drive used by the operating system. 
Thus, the data rates and storage characteristics typical of a computer operating system drive will 
differ from those primarily used by an audio/visual recording device. On the other hand, where 
the main processor performs content analysis of the recorded media, preferably this data is 
available to the operating system. In this case, therefore, the content may be redundantly 
recorded to both storage media, with the data stored for content analysis purged immediately 
after processing. It is also noted that the analysis may occur after completion of recording, from 
the audio-visual storage. 

If the predicted desire of the user is of no help, or the user seeks to explicitly program the 
system, a manual program entry system is available. Where there is no useful prediction of the 
user, the interface may request a training session, which may be a general inquiry, or specifically 
directed to immediately forthcoming broadcasts, or both. 

Thus, the system seeks to determine a reliability of a preference determination. Where 
the determined reliability is sufficiently high, then the device may proceed according to the 
inferred user intent and execute accordingly. On the other hand, where the reliability of the 
prediction is low, the system may prompt the user for feedback to ensure that the operation 
corresponds to that desired by the user. In some instances, an ambiguity may be present in a user 
instruction or interaction. In some cases, for example where the possibilities are inconsistent, the 
system must resolve the ambiguity by further interaction with the user. In other instances, the 
system may execute all not-inconsistent interpretations, for later resolution by the user. 

The reliability of the inference may be determined by examining the population of the 
choice space with actual instances of user input and user feedback. Where the choice space has a 
high population density, and the predictions made by the system are generally accepted as 
accurate by the user, then the system is deemed to have a high reliability for this portion of the 
choice space. On the other hand, where instances in the portion of the choice space are sparse, or 
where the user to some degree disagrees with the predictions made by the system in the portion 
of the choice space, then the reliability may be determined to be low. In the case of low 
reliability, the system first typically seeks to resolve the direct issue, i.e., interpretation of the 
user instruction. If the user is willing, a further set of interactions may then commence to try to 
more fully populate the choice space or define rules or features for the system to apply in the 
future. 
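
The reliability determination described above may be illustrated by a minimal sketch. It assumes, purely for illustration, that each past user interaction is a point in a numeric choice space, that population density is counted within a fixed radius of the query point, and that reliability is the local acceptance rate scaled by that density. The names ChoiceSpace, Instance, radius and min_population, and the 0.7 threshold, are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field
from math import dist


@dataclass
class Instance:
    """One recorded user choice: a point in the choice space plus feedback."""
    features: tuple
    accepted: bool  # True if the user accepted the system's prediction


@dataclass
class ChoiceSpace:
    instances: list = field(default_factory=list)

    def reliability(self, query, radius=1.0, min_population=5):
        """Return a score in [0, 1] for the region around `query`.

        The score is the local acceptance rate, scaled down when the
        region holds fewer than `min_population` instances (assumed rule).
        """
        nearby = [i for i in self.instances if dist(i.features, query) <= radius]
        if not nearby:
            return 0.0  # unpopulated region: no basis for a prediction
        acceptance = sum(i.accepted for i in nearby) / len(nearby)
        density = min(1.0, len(nearby) / min_population)
        return acceptance * density


def next_action(score, threshold=0.7):
    """Act on the inference when reliable; otherwise prompt the user."""
    return "execute" if score >= threshold else "prompt_user"
```

In such a sketch, a densely populated and generally accepted region yields a score near 1 and the device proceeds on the inferred intent, while a sparse or contested region yields a low score and the user is prompted.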

The choice space may be defined by adaptive criteria, for example in the manner of a 
self-organizing neural network, or by predetermined criteria. Preferably, if an electronic program 
guide is available, many criteria are derived either directly or by computation from the types of 
information available in the electronic program guide. A MARS-type system or other known
technique may also be employed. See "Exploring MARS: An Alternative to Neural Networks",
PC AI, January/February 2000, pp. 21-24.

In this case, after a failure to predict a desired program, the user then proceeds to 
explicitly program the VCR interface to record "Married with Children" on Fox at 9:00 p.m. on 
Sunday evening. If a database is available, it might also show that "Married with Children" is 
also syndicated in re-runs, and therefore various episodes may be available on other channels at 
other times. Thus, during the subsequent session, both the premiere showing and re-runs of
"Married with Children" would be available as predicted choices, along with the 11 o'clock News
on NBC. 

In a preferred embodiment, the system then seeks to generalize the selection and 
programming entered by the user to extract pertinent characteristics for future predictions by the 
system. Thus, the user having demonstrated a preference for "Married with Children", the 
interface then characterizes the selected program. This includes, for example, a characterization 
of the soundtrack, closed-caption text, the background, foreground, actors and actresses present, 
visual objects, credits, etc. Of course, an electronic program guide listing for this program is also 
analyzed. The interface then attempts to correlate the features present in the reference selection 
with other available selections, i.e., either contingently stored media or upcoming broadcasts. 
This comparison may be with a preformed database, providing immediate results, or
prospectively, after entry of the reference selection. Of course, a number of correlation functions
may proceed simultaneously, and various choices may be merged to form a compound reference
selection, any ambiguity in which is to be resolved later. Further, as various "episodes" of the
reference selection occur, the system appends and integrates the most recent occurrence with the 
stored reference information, thus updating the reference database. Thus, it is seen that the 
characteristics extracted representing the user selection need not be limited to a single predefined 
program, but in fact may represent a group of programs having one or more common 
characteristics. 

After the reference profile is identified for a preferred type of media, this may be used to 
autonomously operate the system. Thus, when an occurrence corresponding to a user preference 
is identified, it is immediately buffered, until such time as the particular episode may be 
compared against previously stored episodes. If two identical broadcasts occur simultaneously, 
one may be selected, i.e., the one with the best reception. When the episode is identified, if it is 
new, the buffered broadcast information is permanently stored; if it is previously stored, the 
buffer is flushed and the occurrence is further ignored as a "hit". Since the apparatus is now not 
responding to a direct request, it may then perform various housekeeping functions, including 
updating databases of broadcasts and the like. This is because, although the apparatus includes 
default profiles when manufactured, a large number of new broadcasts are always being created 
and presented, so that the apparatus must constantly maintain its "awareness" of data types and 
trends, as well as update its predicted preferences of the user(s). 
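
The buffer-then-decide behavior described above may be sketched as follows. This is a minimal illustration under a strong simplifying assumption: a content hash stands in for episode identification, whereas an actual system would compare extracted features, since two broadcasts of the same episode are unlikely to be byte-identical. The class and method names are hypothetical.

```python
import hashlib


class EpisodeStore:
    """Keeps one permanent copy of each distinct episode."""

    def __init__(self):
        self.permanent = {}  # signature -> stored broadcast data
        self.buffer = None   # broadcast currently held in temporary storage

    @staticmethod
    def signature(data: bytes) -> str:
        # Assumption: a content hash stands in for episode identification.
        return hashlib.sha256(data).hexdigest()

    def buffer_broadcast(self, data: bytes) -> None:
        """Hold a matching broadcast until it can be compared."""
        self.buffer = data

    def resolve(self) -> str:
        """Compare the buffered episode with stored ones, then keep or flush it."""
        if self.buffer is None:
            return "empty"
        sig = self.signature(self.buffer)
        data, self.buffer = self.buffer, None  # the buffer is emptied either way
        if sig in self.permanent:
            return "flushed"  # previously stored: ignore the occurrence
        self.permanent[sig] = data
        return "stored"       # new episode: move to permanent storage
```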

The default characteristics may be derived from collaborative filtering, expert 
programming, or other known technique. 

For example, based on input from the user, other programming, including the same actors 
and/or actresses may be processed, e.g., recorded. For example, Katey Sagal periodically
appears on "Jay Leno" as a musical guest, and therefore may be recorded in these appearances. 

The system according to this example, while requiring certain hardware to be present, 
may be implemented as a software program within a relatively standard personal computer (e.g., 
Pentium III 600 MHz or better) system with MPEG 2 video support and video tuning, input and 
output. Preferably, however, the system includes a hardware MPEG 2 codec and an audio/visual
hard drive separate from that normally used by the operating system.

EXAMPLE 11 

INTELLIGENT ADAPTIVE VCR INTERFACE 

Another example of the use of the present programming system allows a hybrid request 
which does not correspond to any single broadcast schedule entry. In this case, if the user 
instead wishes to record weather reports on all channels, the interface may be of further help. 
The interface controls a plurality of tuner elements 2502 of a video signal reception device 2501, 
so that a plurality of broadcasts may be simultaneously received. Using the mass storage and 
possibly image data compression described above, a plurality of broadcasts may also be recorded 
simultaneously in the intermediate storage 2503. The mass storage may be multiple VCRs, 
optical storage, magneto-optical storage, or magnetic storage including disk (e.g., single disks,
multimedia compatible disks, RAID, etc.) and tape (QIC, 8mm, 4mm, etc.). Preferably, the archival
recording medium is recordable DVD or possibly recordable CD-ROM. 

The optical recording tape produced by ICI, Inc., or other card or tape optical storage 
medium might also be a useful storage medium for large volumes of data, as might be generated 
by recording multiple video signals. The known implementations of the ICI product system are best
suited for commercial or industrial use and not for individual consumer use.

In any case, the interface 2506 accesses its associated database 2413 to determine, at a 
given time, which channels are broadcasting "news". The interface system might also randomly 
or systematically monitor or scan all or a portion of the available broadcasts for "special reports". 
The interface system then monitors these channels for indicia of a "weather" information content 
broadcast. For example, the newscaster who appears to report the weather on a given show is 
usually the same, so that a pattern recognition system 2505 of the video frame could indicate the 
presence of that newscaster. In addition, the satellite photographs, weather radar, computer 
generated weather forecast screens, etc. are often similar for each broadcast. Finally, news 
segments, such as "weather" often appear at the same relative time in the broadcast. Using this 
information, the interface system selects certain broadcast segments for retention. 
This retention begins at the beginning of a news segment, such as "weather", stops recording
during commercials, and continues after return from the break, on all selected channels. In order to
assist in making accurate decisions, the monitored broadcasts may be stored in a temporary
storage medium until a decision is made, and thereafter the recording may be transferred to a more
permanent storage medium if that be appropriate. It is noted that the system of the present
invention is intelligent, and may therefore "learn" either explicitly, or through training by 
example. Therefore, if the system made an error during the process, the user may define the error 
of the system, e.g., a substitute newscaster or rearrangement of news segments, so that the 
interface system has a reduced likelihood of making the same error again. Thus, while such a 
system is inherently complex, it offers significant user advantages. Further, while the interface
system itself is sophisticated, it presents simplicity to the user, performing inductive and deductive
reasoning on the user's behalf.

Thus, a minimum of user interaction is required even for complex tasks, and nearly full 
automation is possible, as long as the user and apparatus are able to communicate to convey a 
preference. As a further embodiment according to the present invention, the interface system
will store transmitted data, and subsequently review that data, extracting pertinent information.
The stored data may then be deleted from the storage medium. In this regard, the system may be
self-learning.

It is noted that various algorithms and formulae for pattern recognition, correlation, data 
compression, transforms, etc., are known to those skilled in the art, and are available in 
compendiums, such as Netravali, Arun N., and Haskell, Barry G., "Digital Pictures:
Representation and Compression", Plenum Press, New York (1988); Baxes, Gregory A., "Digital
Signal Processing, A Practical Primer", Prentice-Hall, Englewood Cliffs, NJ (1984); Gonzalez,
Rafael C., "Digital Image Processing", Addison-Wesley, Reading, MA (1987); and, of a more
general nature, Press, William H. et al., "Numerical Recipes in C: The Art of Scientific
Computing", Cambridge University Press, 1988.

EXAMPLE 12

INTELLIGENT ADAPTIVE VCR INTERFACE 

A further example of the use of the advanced intelligent features of the present invention 
is the use of the system to record, e.g., "live" musical performances. These occur on many "talk" 
shows, such as "Tonight Show" (NBC, 11:30 p.m. to 12:30 a.m., weeknights), "Saturday Night
Live" (NBC 11:30 p.m. to 1:00 a.m. Saturday-Sunday), and other shows or "specials" such as the 
"Grammy Awards". The interface, if requested by the user to record such performances, then 
seeks to determine their occurrence by, e.g., analyzing a broadcast schedule; interacting with the 
on-line database 2411; and by reference to the local database 2413. When the interface 
determines with high probability that a broadcast will occur, it then monitors the channel(s) at
the indicated time(s), through the plurality of tuners 2502. The system may also autonomously 
scan broadcasts for unexpected occurrences. 

In the case of pay-per-view systems and the like, which incorporate encrypted signals, an 
encryption/decryption unit 2509 is provided for decrypting the transmitted signal for analysis and 
viewing. This unit also preferably allows encryption of material in other modes of operation, 
although known decryption systems without this feature may also be employed with the present 
system. During the monitoring, the interface system acquires the audio and video information 
being broadcast, through the signal receiver 2408, and correlates this information with a known 
profile of a "live musical performance", in the preference and event correlator 2412. This must 
be distinguished from music as a part of, e.g., a soundtrack, as well as "musicals" which are part 
of movies and recorded operas, if these are not desired by the user. Further, music videos may 
also be undesirable. When the correlation is high between the broadcast and a reference profile 
of a "live musical performance", the system selects the broadcast for retention. In this case, the 
information in the intermediate storage 2503 is transferred to the plant 2507, which includes a 
permanent storage device 2508. The intermediate storage 2503 medium is used to record a 
"buffer" segment, so that none of the broadcast is lost while the system determines the nature of 
the broadcast. This, of course, allows an extended period for the determination of the type of 
broadcast, so that, while real-time recognition is preferred, it is not absolutely necessary in order 
to gain the advantages of the present invention. The buffer storage data, if not deleted, also 
allows a user to select a portion for retention that the interface system has rejected. 

Thus, while it is preferable to make a determination in real time, or at least maintain real
time throughput with a processing latency, it is possible to make an ex post facto determination
of the nature of the broadcast program. By using an available delay, e.g., about 5 to about 300
seconds, or longer, the reliability of the determination can be greatly increased as compared to an
analysis of a few frames of video data, e.g., about 15 to about 300 ms. An intermediate
reliability will be obtained with a delay of between about 300 and about 5000 ms. As stated
above, the storage system for this determination need not be uncompressed nor lossless, so long
as features necessary to determine the character of the broadcast are present. However, it is
preferred that for broadcast recording intended for later viewing, the storage be as accurate as
possible, so that if a compression algorithm is implemented, it be as lossless as reasonable given
the various constraints. The MPEG-2 standard would be applicable for this purpose, though
other video compression systems are available.

In a preferred situation, approximately 5 minutes of broadcast material is analyzed in
order to make a determination of the content. This broadcast material is stored in two media.
First, it is stored in a format acceptable for viewing, such as videotape in a videotape recorder, or
in digital video format, e.g., compressed in MPEG-2 format. Second, it is received in parallel by
the computer control, where the data is subject to a number of recognition and characterization
processes. These are performed in parallel and in series, to produce a stored extracted feature
matrix. This matrix may contain any type of information related to the broadcast material,
including an uncompressed signal, a compressed signal, a highly processed signal relating to
information contained in particular frames and abstract features, spatially and temporally
dissociated from the broadcast signal, yet including features included in the broadcast which
relate to the content of the broadcast.

One possible method incorporates one or more digital signal processor based coprocessor 
elements, which may be present on, e.g., PCI cards in a standard type Intel personal computer or
Apple Macintosh platform. These elements may be TI TMS320C600X processors, or other
known devices. In fact, native signal processing support of Intel Pentium III processors is
sufficient such that one or more parallel processors or parallel networked computers, operating
under a standard operating system such as Microsoft Windows NT 4.0/2000 or Linux (or other
UNIX-derived platform) may provide sufficient processing power to analyze the content. The
advantage of using a general-purpose host is the volume pricing and ubiquity of such systems. 

A known board containing a DSP is the MacDSP3210 by Spectral Innovations Inc., 
containing an AT&T digital signal processor and an MC68020 CISC processor, and which uses 
the Apple Real-time Operating System Executive (A/ROSE) and Visible Cache Operating 
System (VCOS). It is preferred that the processors employed be optimized for image processing,
because of their higher throughput in the present image processing applications, to process the
video signals, and that one or more other signal processors analyze the audio signals. Of course, general
purpose processors may be used to perform all calculations. An array processor, which may be 
interfaced with a Macintosh is the Superserver-C available from Pacific Parallel Research Inc., 
incorporating parallel Inmos Transputers. Such an array processor may be suitable for parallel 
analysis of the image segment and classification of its attributes. 

Pattern recognition processing, especially after preprocessing of the data signal by digital 
signal processors and image compression engines, may also be assisted by logical inference 
engines, such as FUTURE (Fuzzy Information Processing Turbo Engine) by The Laboratory for 
International Fuzzy Engineering (LIFE), which incorporates multiple Fuzzy Set Processors 
(FSP), which are single-instruction, multiple data path (SIMD) processors. Using a fuzzy logic 
paradigm, the processing system may provide a best fit output to a set of inputs more efficiently 
than standard computational techniques, and since the presently desired result requires a "best 
guess", rather than a very accurate determination, the present interface is an appropriate 
application of this technology. 

As noted above, these processors may also serve other functions such as voice 
recognition for the interface, or extracting text from video transmissions and interpreting it. The 
continued development of optical computers may also dramatically reduce the cost of 
implementing this aspect of the present invention; however, the present state of the art allows the 
basic functions to be performed. See attached appendix of references, incorporated herein by 
reference, detailing various optical computing designs. 

A real time operating system may be employed, of which there are a number of available 
examples. Real Time JAVA, real time Windows CE, RTMX, Micro Digital SMX™, real time
Linux (see www.rtlinux.org), RTX, QNX, HyperKernel, INTime, VxWorks, and pSOSystem (see
http://www.faqs.org/faqs/realtime-computing/faq/) are all examples of operating systems which
have, to some extent, real-time characteristics. Some older examples include SPOX DSP 
operating system, IBM's Mwave operating system and AT&T's VCOS operating system. These 
operating systems, and possibly others, are to be supported by Microsoft Inc.'s Windows 95 
operating system Resource Manager function. 

It is noted that various methods are available for determining a relatedness of two sets of 
data, such as an image or a representation of an image. These include the determination of 
Hausdorff distance, fuzzy correlation, arithmetic correlation, mean square error, neural network 
"energy" minimization, covariance, cross correlation, and other known methods, which may be 
applied to the raw data or after a transformation process, such as an affine transformation, a
Fourier transformation, a wavelet transformation, a Gabor transformation, a warping 
transformation, a color map transformation, and the like. Further, it is emphasized that, in image 
or pattern recognition systems, there is no need that the entire image be correlated or even 
analyzed, nor that any correlation be based on the entirety of that image analyzed. Further, it is 
advantageous to allow redundancy, so that it is not necessary to have unique designations for the 
various aspects of the data to be recognized, nor the patterns to be identified as matching the 
uncharacterized input data. The NDS1000 Development System from Nestor, Inc., provides 
image recognition software which runs on a PC compatible computer and a Data Translation 
DT2878. 
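
Three of the relatedness measures named above may be sketched in a few lines. The following is an illustrative, unoptimized rendering using only the Python standard library; it assumes point sets given as coordinate tuples and signals given as equal-length numeric sequences. The function names are hypothetical, and the cross correlation shown is the normalized, zero-lag form only.

```python
from math import dist


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(dist(p, q) for q in dst) for p in src)
    return max(directed(a, b), directed(b, a))


def mean_square_error(x, y):
    """Mean of squared differences between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    return sum((p - q) ** 2 for p, q in zip(x, y)) / len(x)


def normalized_cross_correlation(x, y):
    """Normalized cross correlation at zero lag, in the range [-1, 1]."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((p - mx) * (q - my) for p, q in zip(x, y))
    den = (sum((p - mx) ** 2 for p in x) * sum((q - my) ** 2 for q in y)) ** 0.5
    return num / den if den else 0.0  # constant input: correlation undefined
```

Consistent with the passage above, any of these may be applied to a selected portion of an image, or to transformed data, rather than to the entire raw image.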

It is noted that many functions of a video recorder might also be facilitated by the use of 
powerful processors. It is also noted that these image recognition functions need not necessarily 
all be executed local to the user, and may in fact be centralized with resultant processed data, or 
portions thereof, transmitted to the remote user. This would be advantageous for two reasons: 
first, the user need not have an entire system of hardware localized in the client device, and 
second, many of the operations which must be performed are common to a number of users, so 
that there is a net efficiency to be gained. In the case of remote execution, non-mainstream PC 
processors and operating systems which provide faster or more complete processing and 
additional features may be desirable. 

EXAMPLE 13

INTELLIGENT ADAPTIVE VCR INTERFACE 

The interface of the present invention incorporates an intelligent user interface level 
determination. This function analyzes the quality of the user input, rather than its content. Thus, 
this differs from the normal interface user level determination that requires an explicit entry of 
the desired user level, which is maintained throughout the interface until explicitly changed. The 
present interface may incorporate the "smart screen" feature discussed above, which may,
through its analysis of past user interaction with the interface, predict the most likely
user input function. Thus, the predictive aspects of the present invention may be
considered a related concept to the intelligent user level interface of the present invention. 
However, the following better serves to define this aspect of the invention. 

The input device, in addition to defining a desired command, also provides certain 
information about the user which has heretofore been generally ignored or intentionally removed. 
With respect to a two-dimensional input device, such as a mouse, trackball, joystick, etc., this 
information includes a velocity component, an efficiency of input, an accuracy of input, an 
interruption of input, and a high frequency component of input. This system is shown 
schematically in Fig. 21, which has a speed detector 2104, a path optimization detector 2105, a 
selection quality detector 2106, a current programming status 2108, an error counter 2109, a 
cancel counter 2110, a high frequency signal component detector 2112, an accuracy detector
2113 and a physio-dynamic optimization detector 2114. In addition, Fig. 21 also shows that the 
interface also uses a past user history 2107, an explicit user level choice 2111 and an explicit 
help request 2115. 

This list is not exclusive, and is somewhat dependent on the characteristics of the specific 
input device. For a mouse, trackball, or other like device, the velocity or speed component refers 
to the speed of movement of the sensing element, i.e. the rotating ball. This may also be 
direction sensitive, i.e., velocity vector. It is inferred that, all other things being equal, the higher 
the velocity, the more likely that the user "knows" what he is doing. 

The efficiency of input refers to two aspects of the user interface. First, it refers to the 
selection of that choice which most simply leads to the selection of the desired selection. For 
example, if "noon" is an available choice along with direct entry of numbers, then the selection 
of "noon" instead of "12:00 p.m." would be more efficient. The second aspect of efficiency has
to do with the path taken by the user in moving a graphic user interface cursor or input device
from a current position to a desired position. For example, a random curve or squiggle between
locations is less efficient than a straight line. This effect is limited, and must be analyzed in
conjunction with the amount of time it takes to move from one location of a cursor on the screen
to another; if the speed of movement is very rapid, i.e., less than about 400 ms for a full screen
length movement, or less than about 300 ms for small movements, then an inefficiency in path is
likely due to the momentum of the mouse and hand, momentum of the rolling ball, or a 
physiological arc of a joint. This aspect is detected by the physio-dynamic optimization detector 
2114. Thus, only if the movement is slow, deliberate, and inefficient, should this factor weigh 
heavily. It is noted that arcs of movement, as well as uncritical damping of movement around 
the terminal position may be more efficient, and a straight path actually inefficient, so that the 
interface may therefore calculate efficiency based on a complex determination, and act 
accordingly where indicated. 
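
The path efficiency analysis may be illustrated by the following minimal sketch. It assumes that efficiency is measured as the ratio of straight-line distance to distance actually travelled, and that rapid movements (under about 400 ms full screen, or about 300 ms for small movements, per the passage above) are given zero weight. The all-or-nothing weighting and the function names are assumptions made for illustration; the "complex determination" contemplated above could weigh arcs and damping differently.

```python
from math import dist


def path_efficiency(points):
    """Ratio of straight-line distance to distance actually travelled (0..1)."""
    travelled = sum(dist(p, q) for p, q in zip(points, points[1:]))
    if travelled == 0:
        return 1.0  # no movement: nothing to penalize
    return dist(points[0], points[-1]) / travelled


def efficiency_weight(duration_ms, full_screen=True):
    """Weight given to path inefficiency when estimating user level.

    Movements faster than the limit are treated as ballistic, so their
    inefficiency is attributed to momentum or joint arcs (assumed rule).
    """
    limit = 400 if full_screen else 300
    return 0.0 if duration_ms < limit else 1.0


def inefficiency_penalty(points, duration_ms, full_screen=True):
    """Penalty in [0, 1]; nonzero only for slow, deliberate, indirect paths."""
    return (1.0 - path_efficiency(points)) * efficiency_weight(duration_ms, full_screen)
```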

Thus, an "efficient" movement would indicate a user who may work at a high level, and 
conversely, an inefficient movement would indicate a user who should be presented with simpler 
choices. The efficiency of movement is distinguished from gestures and path dependent inputs, 
such as drawing and painting. These may be distinguished based on machine status or context. 
Further, the interface may recognize gestures in many contexts. Therefore, gestures or
gesticulations must be distinguished from direct command inputs before further processing. 
Gestures or gesticulations, like path efficiency, may also be analyzed separately from the basic 
command input, and therefore may be provided as a separate input stream on an interface level 
rather than an application level, thus allowing cross application operation. 

Likewise, if a movement is abrupt or interrupted, yet follows an efficient path, this would 
indicate a probable need for a lower user interface level. This would be detected in a number of 
elements shown in Fig. 21, the speed detector 2104, a high frequency signal component detector 
2112, an accuracy detector 2113 and a physio-dynamic optimization detector 2114. In addition, 
Fig. 21 also shows the use of a past user history 2107, an explicit user level choice 2111 and an
explicit help request 2115. 

While the interface may incorporate screen buttons that are smart, i.e. those that
intelligently resolve ambiguous end locations, the accuracy of the endpoint is another factor in 
determining the probable level of the user. Thus, for example, if a 14" color monitor screen is 
used, having a resolution of 640 by 480 pixels, an accurate endpoint location might be 
considered within a central area of a displayed screen button of size about 0.3" by about 1.0", for 
example within an area of about 0.25" by about 0.75". A cursor location outside this location, 
but inside the screen button confines would indicate an average user, while a cursor location 
outside the screen button may be inferred to indicate the button, with an indication that the user 
is less experienced in using the pointing device. These are not necessary conclusions; for
example, a skilled user may efficiently point to an edge of an active area on the screen, while a
novice user may slowly and deliberately point to a precise center location; therefore, evaluation 
of a number of characteristics may be helpful in inferring user skill level or other types of 
characteristics. 
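
The endpoint accuracy test may be sketched as a simple three-way classification. The sketch below uses the example dimensions given above (a button of about 0.3" by about 1.0" with a central area of about 0.25" by about 0.75") and assumes, for illustration, that the longer dimension is the width and that both rectangles share a center. The function name and return labels are hypothetical.

```python
def classify_endpoint(x, y, button_center,
                      button_size=(1.0, 0.3), core_size=(0.75, 0.25)):
    """Classify a cursor endpoint relative to a screen button (inches).

    Sizes are (width, height). Returns "accurate" inside the central
    area, "average" elsewhere inside the button, and "outside" otherwise.
    """
    dx = abs(x - button_center[0])
    dy = abs(y - button_center[1])

    def within(size):
        # True when the endpoint lies in a centered rectangle of this size.
        return dx <= size[0] / 2 and dy <= size[1] / 2

    if within(core_size):
        return "accurate"
    if within(button_size):
        return "average"
    return "outside"
```

As the passage cautions, such a label is only one input; it would be combined with the other detectors before any conclusion about user level is drawn.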

Finally, in addition to the efficiency of the path of the cursor pointing device, a high 
frequency component may be extracted from the pointer signal by the high frequency signal 
component detector 2112, which would indicate a physical infirmity of the user (tremor), a 
distraction in using the interface, indecision in use, or environmental disturbance such as 
vibration. In this case, the presence of a large amount of high frequency signal indicates that, at 
least, the cursor movement is likely to be inaccurate, and possibly that the user desires a lower 
user level. While this is ambiguous based on the high frequency signal content alone, in 
conjunction with the other indicia, it may be interpreted. If, for example, the jitter is due to 
environmental vibrations, and the user is actually a high level user, then the response of the user 
level adjust system would be to provide a screen display with a lowered required accuracy of 
cursor placement, without necessarily qualitatively reducing the implied user level of the
presented choices. Thus, it would have an impact on the display simplification 2103, with only
the necessary changes in the current user level 2101.
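
One simple way to extract a high frequency component from a pointer signal is to subtract a moving average and measure the energy of the residual. The following sketch takes that approach for a single coordinate of the pointer trace; the moving-average filter, the window length, the 0.05 threshold and the function names are all assumptions for illustration, and a practical detector might instead use a designed high-pass filter or a spectral estimate.

```python
def high_frequency_energy(samples, window=5):
    """Mean squared residual after subtracting a centered moving average."""
    half = window // 2
    if len(samples) < 2 * half + 1:
        raise ValueError("too few samples for the chosen window")
    residuals = []
    for i in range(half, len(samples) - half):
        chunk = samples[i - half:i + half + 1]
        residuals.append(samples[i] - sum(chunk) / len(chunk))
    return sum(r * r for r in residuals) / len(residuals)


def adjust_display(jitter, skilled, threshold=0.05):
    """Return (relax_accuracy, lower_user_level).

    High jitter always relaxes the required cursor accuracy; the user
    level is lowered only if other indicia do not show a skilled user.
    """
    if jitter <= threshold:
        return (False, False)
    return (True, not skilled)
```

A smooth ramp produces zero residual energy, while a trace alternating between two values produces a large one; in the latter case a user otherwise judged skilled would receive larger targets without simplified choices.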

Alternatively, the user may input a gesture, i.e., a stylized input having no other 
command input meaning, which may be detected by analyzing the input. The input may be a 
manual input, voice input, image (e.g., sketch, video image capture, image exemplar) or the like. 
A number of different gestures may be recognized. These gestures are generally explicit inputs,
which allow a voluntary action to be interpreted as input information to the interface. 

EXAMPLE 14 

INTELLIGENT TELEPHONE DEVICE INTERFACE 

Likewise, the present interface could be used to control complex telecommunications 
functions of advanced telephone and telecommunications equipment. In such a case, the user 
display interface would be a video display, or a flat panel display, such as an LCD display. The 
interface would hierarchically present the available choices to the user, based on a probability of 
selection by the user. The input device would be, for example, a small track ball near the 
keypad. Thus, simple telephone dialing would not be substantially impeded, while complex 
functions, such as call diversion, automated teledictation control, complex conferencing, caller 
identification-database interaction, and videotel systems, could easily be performed. 
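
The hierarchical presentation of choices by probability of selection may be illustrated as follows. The sketch assumes that the probability of selection is estimated from relative frequency of past use, and that the most probable functions appear on a first menu level with the remainder placed in a submenu. The function name, the top_n parameter and the example function labels are hypothetical.

```python
def build_menu(usage_counts, all_functions, top_n=3):
    """Split functions into a first-level menu and an overflow submenu.

    Functions are ranked by relative frequency of past selection; ties
    keep the order given in `all_functions` because the sort is stable.
    """
    total = sum(usage_counts.values()) or 1  # avoid division by zero
    ranked = sorted(all_functions,
                    key=lambda f: -usage_counts.get(f, 0) / total)
    return ranked[:top_n], ranked[top_n:]
```

For example, with past counts of 9 for "conferencing" and 5 for "call diversion", and top_n set to 2, those two functions are presented first and the remaining functions fall to the second level.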

The present invention allows complete integration of telephony operations, including 
voice over IP (VOIP), video conferencing, call center functions, telephone answering/voice 
mail/automated attendant functions, and the like. The controller may also provide such functions 
as least-cost routing calculations and the like. 

Preferably, the interface according to the present invention provides an adaptive interface 
for use of the system, which customizes the information presented to the user and the information 
elicited from the user based on a user characterization or profile, the context of use, and possibly
the past history of use by that user or a group of users. The use of past history is optional, since 
salient user characteristics are present in the user profile, also based on past history, but generally 
at a higher level of abstraction. It is also possible to employ the past history alone, without 
abstracting the information to generate a user profile. In some cases, the relevant information for 
a user profile will be largely distinguished from the relevant information for a user past history of 
use, since the user profile is intended to be largely generalizable characteristics, while the past 
history may be intended to be largely specific examples of use. 

EXAMPLE 15

AUDIO RECORDING MANAGEMENT SYSTEM 

The present invention is also suitable for use as a system, method and/or user interface for 
audio files, for example in a jukebox or background music arrangement. The audio files may be
provided by wireless communications (e.g., FM radio, satellite, cellular techniques, TV band
subcarriers, etc.), wired communications (e.g., telephone, Internet, DSL, T1, etc.), physical
storage media (e.g., musical compact disks), etc. The preferred system provides a user
preference based "filter", allowing the user to personalize the listening experience. In the event of
a background music application, instead of a personal preference, a collaborative filtering
technique is applied, to determine a group preference. The technology may encompass a
number of different methods of filtering, including musical style, artist, popularity, semantic 
content, play history, or the like. See, Music, Mind, Machine, Computational Modeling of 
Temporal Structure in Musical Knowledge and Music Cognition, [Unpublished manuscript, 
August 1995, Peter Desain & Henkjan Honing], http://www.douglas.bc.ca/-landonb/360/DH-95- 
C.HTML , expressly incorporated herein by reference. 

The basis for characterizing the audio may include in band signals and content analysis, 
out of band signals, electronic program guides and associated data records, and explicit user 
characterization. A preferred system employs a standard North American FM broadcast system 
in which a metadata stream is encoded within the audio channel, similar to the Secure Digital 
Music Initiative (SDMI) technique, for example employing the audio watermarking technology 
of Verance Corp (Aris Corp. and Solana Corp.), or Arbitron. This metadata provides a digital 
data stream which provides identification and preferably characteristics of the song. This 
information is decoded at the receiver, and an intelligent decision may then be made concerning 
the associated content, for example, record, play live, or purge/ignore. Preferably, a mass storage 
system is provided to buffer content, at least until a decision is made, and preferably for long 
term storage. Thus, the broadcaster need not redundantly broadcast content, as it can be repeated 
from local storage. On the other hand, such a system may scan multiple channels, to define a 
custom play list. 
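
The receiver-side decision described above (record, play live, or purge/ignore) may be sketched 
as follows. This is an illustrative sketch only, not the disclosed implementation: the metadata 
fields (song_id, style), the preference scores, and the two thresholds are assumptions 
introduced for the example. 

```python
from dataclasses import dataclass


@dataclass
class SegmentMetadata:
    song_id: str  # identifier decoded from the in-band metadata stream
    style: str    # one characteristic of the song (hypothetical field)


def decide(meta, style_scores, stored_ids, play_threshold=0.8, keep_threshold=0.5):
    """Return 'play_live', 'record', or 'purge' for a buffered segment.

    style_scores maps a style to a user preference in [0, 1]; both
    thresholds are illustrative assumptions.
    """
    score = style_scores.get(meta.style, 0.0)
    if score >= play_threshold:
        return "play_live"
    # Keep moderately preferred content only if it is not already stored locally.
    if score >= keep_threshold and meta.song_id not in stored_ids:
        return "record"
    return "purge"


prefs = {"jazz": 0.9, "folk": 0.6, "polka": 0.1}
stored = {"song-17"}
print(decide(SegmentMetadata("song-42", "jazz"), prefs, stored))   # play_live
print(decide(SegmentMetadata("song-43", "folk"), prefs, stored))   # record
print(decide(SegmentMetadata("song-17", "folk"), prefs, stored))   # purge
print(decide(SegmentMetadata("song-44", "polka"), prefs, stored))  # purge
```

Because content already held in local storage is not recorded again, a sketch of this kind is 
consistent with the observation that the broadcaster need not redundantly broadcast content. 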

The content may also be derived from an on-line source, for example an MP3 (MPEG 
Audio Layer 3 encoding) file, and downloaded and stored in this format. In the case of an 
Internet download, the metadata need not be encoded within the audio signal, and may be 
provided as a separate data file, or even from a separate source entirely. 

In a broadcast system, each broadcast "segment" is preferably encoded with an identifier, 
which can then be interpreted using a local database at the receiver. Preferably, the broadcast is 
encoded with a full set of characteristics, so that a complete local database is not required, thus 
preserving storage capability for content rather than metadata. 

In a preferred embodiment, as an integral part of the design, means are provided for 
respecting the interests of broadcasters, commercial sponsors, and copyright holders. In other 
words, a general or specific accounting is made for use of media. In theory, the costs to the 
consumer need not be greater than at present, and, in fact, with efficiencies, may actually be 
reduced. Thus, where a listener presently has no costs for use of broadcast radio, use of the 
present system will likely also be without direct user cost. Instead, an accounting system is 
provided for distributing costs and revenues among the broadcaster, sponsor, and service 
provider. Media stored in a receiving device may be encrypted, to assure compliance with 
licensor-imposed restrictions. In order to promote user compliance with the system, incentives 
may be provided to the user to cooperate with data gathering. Such incentives can, but need not, 
be monetary. The system may also provide demographically targeted advertising. Thus, instead 
of directly playing commercials inserted by a broadcaster, a set of commercials or advertisements 
may be presented to the user aligned with the user's tastes, preferences, and value to sponsors. A 
user may also eliminate or defer all advertisements, at some cost. Therefore, the accounting 
system seeks to attribute costs and revenues based on source, recipient and contracted 
sponsorship. According to this model, each targeted listener is presumably more valuable to a 
sponsor than an unselected listener. Thus, a listener may be burdened with fewer commercials. 
Due to time-shifting, broadcasters will be able to achieve higher valuation for off-hours 
broadcasts. Sponsors which appropriately target advertisements will see lower advertising costs 
and higher response rates. 

The preferred design takes the form of an audiophile, automotive or personal radio 
device, likely integrated with an MP3 codec and large hard disk drive, for example 20-40 Gbytes. 

The service provider may be compensated by the user, in the form of a fixed or variable 
service charge, or by the broadcasters or the sponsors. Typically, the user will have a 
relationship with the provider, due to privacy concerns. Thus, the provider also serves as an 
aggregator and portal, filtering user identity from the broadcasters and sponsors. Some premium 
broadcasts may be encrypted, with users being charged a fee for decryption, which may be 
subject to any appropriate rules, such as complete decryption, play once, play for a limited 
period, copy once, etc. 

As stated above, commercials may be stored on the user's local system and played back. 
These commercials may be subject to personalized filtering as well, and therefore the 
per-impression ad rates may exceed the normal ad rates. This will lead to increased advertising 
revenues for the broadcasters, which may be shared with the licensors. Using, for example, 
the Internet as an uplink channel, auditing and verification techniques may be employed. In this 
case, the device preferably has an internal modem or USB port. For audiophile or integrated 
video-audio devices, an IEEE-1394 port may be preferred. Depending on the implemented 
privacy policy, which may vary between users, marketable personal profile and demographic 
information may be generated and exploited. 

An intelligent radio system provides substantial advantages over simple Internet 
downloads of MP3 files, which are quite popular. The technology is fundamentally a "push" or 
broadcast technology, using relatively cheap bandwidth. Real-time delivery is assured. Using a 
combination of time shifting and multiple broadcast channels, a wide variety of source material 
will be available periodically, negating the need for large local memory at the client system. 

EXAMPLE 16 

CHARACTER RECOGNITION OF VIDEO 

The present invention may incorporate character recognition from the video broadcast for 
automatic entry of this information. This is shown schematically in Fig. 24, with the inclusion of 
the videotext and character recognition module 2414. This information is shown to be 
transmitted to the event characterization unit 2407, where the detected information is correlated 
with the other available information. This information may also be returned to the control 2402. 
Examples of the types of information that would be recognized are titles of shows, cast and crew 
from programming material, broadcast special alerts, time (from digital display on special access 
channels), stock prices from "ticker tape" on special access channels, etc. Thus, this technology 
adds functionality to the interface. In addition, subtitled presentations could be recognized and 
presented through a voice synthesizer, to avoid the necessity of reading the subtitle. Further, 
foreign language subtitles could be translated into, e.g., English, and presented. In a particular 
embodiment, certain game shows, such as "Wheel of Fortune" have alphanumeric data presented 
as a part of the programming. This alphanumeric text may be extracted from the image. 

In a preferred embodiment, the character recognition is performed in known manner on a 
buffer memory containing a frame of video, from a device such as a Data Translation DT2851, 
DT2853, DT2855, DT2867, DT2861, DT2862 or DT2871. A contrast algorithm, run on, for 
example, a Data Translation DT2858, DT2868, or DT2878, first removes the background, 
leaving the characters. This works especially well where the characters are of a single color, e.g. 
white, so that all other colors are masked. After the "layer" containing the information to be 
recognized is masked, an algorithm similar to that used for optical character recognition (OCR) 
is employed. See, U.S. 5,262,860, incorporated herein by reference. These methods are well 
known in the art. This may be specially tuned to the resolution of the video device, e.g. NTSC, 
Super Video Home System (S-VHS), High Definition Television and/or Advanced Television 
Systems Committee (HDTV/ATSC-various included formats), Improved Definition Television 
(IDTV), Enhanced Definition Television (EDTV), Multiple Sub-Nyquist Sampling Encoding 
(MUSE), Phase Alternate Line (PAL), Séquentiel Couleur à Mémoire (SECAM), MPEG-2 digital 
video, or other analog or digital transmission and/or storage formats, etc. In addition, since the 
text normally lasts for a period in excess of one frame, a spatial-temporal image enhancement 
algorithm may be employed to improve the quality of the information to be recognized, if it is 
indistinct in a single frame. 
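
The two preprocessing steps named above, temporal enhancement over several frames and masking 
of everything except single-color (e.g., white) characters, may be sketched as follows. The 
sketch assumes 8-bit grayscale frames held as nested lists and an arbitrary brightness 
threshold of 200. Both are assumptions made for illustration, and the OCR stage itself is not 
shown. 

```python
def temporal_average(frames):
    """Average co-located pixels over several frames showing the same text."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(cols)]
            for r in range(rows)]


def mask_characters(frame, threshold=200):
    """Keep near-white pixels (assumed to be text); zero out the background."""
    return [[p if p >= threshold else 0 for p in row] for row in frame]


# Three noisy 2x2 frames in which the diagonal pixels belong to white text.
frames = [
    [[250, 30], [90, 240]],
    [[230, 60], [30, 250]],
    [[240, 30], [60, 230]],
]
enhanced = temporal_average(frames)
print(mask_characters(enhanced))  # [[240.0, 0], [0, 240.0]]
```

Averaging before thresholding lets a pixel that is noisy in any single frame still be 
classified by its typical brightness over the period in which the text is displayed. 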

EXAMPLE 17 

SMART HOUSE INTERFACE 

The present invention may also be incorporated into other types of programmable 
controls, for example those necessary or otherwise used in the control of a smart house. See, 
"The Smart House: Human Factors in Home Automation", Human Factors in Practice, Dec. 
1990, 1-36. The user interface in such a system is very important, because it must present the 
relevant data to the user for programming the control to perform the desired function. A smart 
house would likely have many rarely used functions, so that both the data and the available 
program options must be presented in the simplest manner consistent with the goal of allowing 
the user to make the desired program choice. For example, a smart house system with 
appropriate sensors might be used to execute the program: "start dishwasher, if more than half 
full, at 9:00 p.m." This program might also include a program to load soap into the dishwasher 
or to check if soap is already loaded. A user who wishes to delay starting until 11:00 p.m. would 
be initially presented with the defaults, including start time as an option, which would be simply 
modified by correcting the starting time. The next time the same user wishes to program the 
device, an algorithm might change the predicted starting time to, e.g. 10:00 p.m., which is a 
compromise between the historical choices. Alternatively, the new predicted start time might be 
11:00 p.m., the last actually programmed sequence. Finally, the next predicted start time might 
remain at 9:00 p.m. The resolution of these choices would depend on a number of factors: a 
preprogrammed expert system; any other prior history of the user, even with respect to other 
appliances or in other situations; the context, meaning any other contemporaneously programmed 
sequences; and an explicit input from the user as to how the inputs should be evaluated for 
predictive purposes. 
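
The three candidate predictions in the dishwasher example (a compromise, the last programmed 
value, or the unchanged default) may be illustrated with the following sketch. The function 
names and the use of a simple arithmetic mean for the compromise are assumptions. As noted 
above, the choice among the strategies would in practice depend on the expert system, history, 
context and explicit user input. 

```python
def predict_start(default_min, history_min, strategy="compromise"):
    """Predict the next start time, in minutes after midnight.

    default_min: the preprogrammed default; history_min: times the user
    actually programmed, oldest first. Strategy names are illustrative.
    """
    if not history_min or strategy == "default":
        return default_min
    if strategy == "last":
        return history_min[-1]
    if strategy == "compromise":
        values = [default_min] + history_min
        return round(sum(values) / len(values))
    raise ValueError(f"unknown strategy: {strategy}")


def fmt(minutes):
    """Format minutes after midnight as HH:MM on a 24-hour clock."""
    hour, minute = divmod(minutes, 60)
    return f"{hour:02d}:{minute:02d}"


default = 21 * 60    # 9:00 p.m.
history = [23 * 60]  # the user once delayed the start to 11:00 p.m.
print(fmt(predict_start(default, history, "compromise")))  # 22:00
print(fmt(predict_start(default, history, "last")))        # 23:00
print(fmt(predict_start(default, history, "default")))     # 21:00
```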

The expert system may balance many factors, including disturbing noise from the 
dishwasher, which might be objectionable while persons are near the dishwasher, people are 
sleeping, or during formal entertainment nearby. On the other hand, if the dishwasher is full, or 
its cleaned contents are needed, the dishwasher should run with higher priority. Some persons 
prefer to reshelve dishes in the evening, before sleep, so in those cases, the dishwasher should 
complete its cycle before bedtime. The dishwasher, on a hot water cycle, should not run during 
showers or baths, and preferably should not compete with a clothes washer for hot water. This 
may be sensed by direct communication with other systems, or by sensing pressure or vibration 
in the water feed lines. The dishwasher preferably does not run during peak electrical demand 
times, especially if electrical rates are higher. Water conserving cycles should be selected, 
especially during droughts or water emergencies. If dishes remain in the dishwasher for an 
extended period, e.g., overnight, a moistening cycle may be employed to help loosen dirt and to 
help prevent drying. On the other hand, a fast cycle may also be provided where desired. Thus, 
the expert system is preprogrammed for a number of high-level considerations that might be 
common to a large number of users of the system, thus shortening the required training time of 
the system to learn the preferences of the user. Such a sophisticated system may eliminate the 
need entirely for adaptive responses, based on weighing of considerations provided by the user. 
Of course, other considerations may also be included for the operation or delay of operation of 
the dishwasher. Further, these considerations are exemplary of the types of considerations which 
might be employed in an expert system in a smart house. 
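
One simple way to picture the balancing of such considerations is a table of weighted rules, as 
in the sketch below. The condition names and the numeric weights are hypothetical and chosen 
only for illustration. An actual expert system may use a far richer rule representation. 

```python
# (condition, weight): positive weights favor running now, negative favor delay.
# All names and weights are illustrative assumptions.
RULES = [
    ("dishwasher_full", 3.0),
    ("contents_needed", 3.0),
    ("people_sleeping", -2.0),
    ("formal_entertainment", -2.0),
    ("shower_in_use", -4.0),
    ("peak_electric_rate", -1.5),
]


def run_score(conditions):
    """Sum the weights of every rule whose condition is currently true."""
    return sum(weight for name, weight in RULES if conditions.get(name, False))


def should_run(conditions):
    """Run the dishwasher only when the weighted considerations favor it."""
    return run_score(conditions) > 0


print(should_run({"dishwasher_full": True, "peak_electric_rate": True}))  # True
print(should_run({"dishwasher_full": True, "shower_in_use": True}))       # False
```

Under this representation, learning a particular user's preferences amounts to adjusting the 
preprogrammed weights rather than building the rule set from nothing. 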

The prior history of the user provides an excellent source of information regarding the 
preferences of the user, although this is sometimes not the most efficient means, and may often 
include contradictory data. This historical use data is therefore analyzed in a broad context in 
order to extract trends, which over a number of uses may be further extracted as "rules". Often, 
the user history data will be applied at a high level, and will interact with preexisting rules of the 
expert system, rather than create new rules. In this case, the expert system preferably includes 
a large number of "extra rules", i.e., those with an a priori low probability or low weighting, 
providing a template for future pattern matching. The past history may be evaluated in a number 
of ways. First, an expert system may be used to analyze the past usage pattern. Second, a neural 
network may be trained using the historical data along with any corrective feedback. Third, the 
historical data may be used to alter fuzzy logic rules or classifications, either by expert system, 
neural network, or by other known means. Thus, as stated above, the user profile, while 
potentially related to history of use, may include distinct information, such as explicit entry of 
user preferences and path dependent characteristics normally filtered from a stored past history. 

The context of use may also be used to determine a desired or predicted action. 
Therefore, if on a single occasion, a number of changes are made, for example during a large 
house party, the standard predictions would not be altered, and thus a normal program would 
remain in effect. Of course, a new "house party" sequence would then be recognized and 
included as a new type of sequence for future evaluation. For example, a house party sequence 
might encompass a number of house systems. Thus, the delay of the dishwasher until 11:00 p.m. 
allows all dishes from the party to be placed in the dishwasher before starting. An alarm system 
would be generally deactivated, although various zones may be provided with different 
protection; e.g., a master suite may be off-limits, with an alarm transmitting a signal to a user's 
beeper, rather than a call to police or alarm service company. During the summer, the air 
conditioner might run even if doors and windows are open, even if the normal program prompts 
for door closings before the air conditioner is turned on. Likewise, exterior lighting would be 
turned on at dusk, with bug lights turned on during the entire party. The user might individually 
make such decisions, which would be recognized as a group due to their proximity in time, or 
delineate the actions as a group. Thereafter, where some of these choices are made, and the 
profile of choices matches a "party" style, the remainder of the choices may be presented as a 
most likely or predicted choice. The group of choices together might also be selected from a 
menu of choices. Appropriate sensors may be provided for each system, or for the house as a 
whole, to detect the relevant conditions. Preferably, sets of conditions may be determined based 
on a population statistic, i.e., collected from a variety of sources, and stored centrally in a library. 
The system may then communicate with the library, for example through the Internet, to search 
for a resource in the library which matches detected or anticipated conditions. If such a resource 
is identified, it is retrieved and processed according to local variations, which may include local 
hardware configurations, user preferences, and the like, and then checked for consistency. If 
consistent, this modified resource may then be executed, providing an adaptive control 
methodology. If inconsistent, another resource may be selected, or the user may be involved in 
correcting the issues identified. 
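
The matching of a partial set of choices against a stored "party" style, with the remaining 
choices offered as predictions, may be sketched as follows. The profile contents, the setting 
names, and the requirement of at least two matching choices are assumptions made for this 
example. 

```python
# Hypothetical stored sequences; keys and values are illustrative only.
PROFILES = {
    "party": {"dishwasher_start": "23:00", "alarm": "zones_only",
              "air_conditioner": "run_with_doors_open", "bug_lights": "on"},
    "normal": {"dishwasher_start": "21:00", "alarm": "armed",
               "air_conditioner": "require_doors_closed", "bug_lights": "off"},
}


def best_profile(choices, profiles=PROFILES, min_matches=2):
    """Return (profile name, remaining predicted choices), or (None, {})."""
    best_name, best_hits = None, 0
    for name, settings in profiles.items():
        hits = sum(1 for key, value in choices.items() if settings.get(key) == value)
        if hits > best_hits:
            best_name, best_hits = name, hits
    if best_hits < min_matches:
        return None, {}
    # Offer only the settings the user has not yet chosen.
    remaining = {k: v for k, v in profiles[best_name].items() if k not in choices}
    return best_name, remaining


name, predicted = best_profile({"dishwasher_start": "23:00", "bug_lights": "on"})
print(name)       # party
print(predicted)  # {'alarm': 'zones_only', 'air_conditioner': 'run_with_doors_open'}
```

The same comparison could be run against resources obtained from a central library, with the 
selected resource then adjusted for local hardware and preferences before use. 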

Context also relates to sensor data, which might include sensors in particular appliances 
or unrelated sensors. For example, video, audio, ultrasonic, radar, lidar, and/or infrared motion 
detectors may be used to estimate the number of persons present in a house. Likewise, heavy use 
of a bathroom, as detected by plumbing sensors, flushes, frequent light transitions or door 
openings, might also be useful as data to estimate a crowd size. Temperature sensors, video 
imaging sensors, perimeter sensors, electrical sensors relating to the status of appliances and 
machinery, and other types of sensors may provide data for context determination. 

Of course, explicit inputs must also be accommodated, which may be atomic instructions 
or complex combinations of instructions which may control a single house system or a number of 
house systems simultaneously. The explicit input preferably comes by way of the adaptive 
interface described throughout the present application, or an interface incorporating particular 
aspects thereof. 

The smart house system also controls the climate control system. Thus, it could 
coordinate temperatures, airflow and other factors, based on learned complex behaviors, such as 
individual movement within the dwelling. Since the goal of the programming of the smart house 
is not based on the storage of discrete information, but rather the execution of control sequences 
at various times and under certain circumstances, the control would differ in various ways from 
that of a consumer entertainment management device. However, the user interface system, 
adaptive user level, help system, and the like might share substantial similarities. 

It is noted that a common user interface system may be provided for multiple systems, for 
example communicating through a network, which may be wired, wireless or communicate 
through power lines or light waves, thus allowing for the consumer entertainment management 
device and other devices within a smart house to share hardware and software resources, even if 
these devices have different essential control systems, so that the common elements are not 
redundant. Therefore, by applying a single control to many tasks, a common user interface is 
used, and the cost is reduced. 

EXAMPLE 18 

PROGRAMMABLE ENVIRONMENTAL CONTROLLER 

The present Example relates to a programmable environmental controller application. In 
this case, a sensor or sensor array is arranged to detect a change in the environment that is related 
to a climatic condition, such as an open door. On the occurrence of the door opening, the system 
would apply a pattern recognition analysis to recognize this particular sensor pattern, i.e. a mass 
of air at a different temperature entering the environment from a single location, or a loss of 
climate controlled air to a single location. These sensor patterns must be distinguished from 
other events, such as the action of appliances, movement of individuals in the vicinity of the 
sensor, a shower and other such events. It is noted that in this instance, a neural network based 
adaptive controller may be more efficient than a standard fuzzy logic system, because the 
installation and design of such a system is custom, and therefore it would be difficult to program 
fuzzy set associations a priori. In this case, a learning system, such as a neural network, may be 
more efficient in operation and produce a better result than other adaptive methods. The training 
procedure may be fully automated (with manual feedback provided where necessary to adjust 
the control parameters), so long as sufficient sensors are provided for controlling the system, and 
so long as an initial presumption of the control strategy is workable during the training period. In 
the case of an HVAC system, the initial strategy incorporated is the prior art "bang-bang" 
controller, which operates as a simple thermostat, or multi-zone thermostat. As a better starting 
point, a fuzzy logic temperature controller may be modeled and employed. Other known 
strategies that are not often used in environmental control include the 
proportional-integral-differential controller (PID). The present control is preferably model based 
or MARS, applying direct knowledge of the control task and characteristics of the system to the 
control issue. Likewise, in HVAC systems, cost and operational efficiency are often paramount 
concerns, and the control preferably is responsive to sensors for energy consumption and/or 
efficiency. 
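
For reference, the two conventional strategies named above may be sketched as follows. The 
hysteresis band, the gains and the sample interval are arbitrary illustrative values, and the 
sketch shows the textbook forms of these controllers rather than the adaptive control of the 
present Example. 

```python
def bang_bang(temp, setpoint, heating_on, hysteresis=0.5):
    """Simple thermostat: switch heat on below the band, off above it."""
    if temp < setpoint - hysteresis:
        return True
    if temp > setpoint + hysteresis:
        return False
    return heating_on  # inside the band: keep the previous state


class PID:
    """Textbook proportional-integral-differential controller."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measured, dt):
        error = setpoint - measured
        self.integral += error * dt
        # No derivative term is available on the first sample.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


print(bang_bang(19.0, 20.0, heating_on=False))  # True
print(bang_bang(20.2, 20.0, heating_on=True))   # True
pid = PID(kp=2.0, ki=0.5, kd=0.0)
print(pid.update(setpoint=20.0, measured=18.0, dt=1.0))  # 5.0
```

The bang-bang output is binary, whereas the PID output is a continuous drive level, which is 
one reason the former serves as a workable but coarse initial strategy during training. 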

It is noted that the HVAC system may also be of a type that is inoperable with standard 
type controllers; for example, the system may be such as to produce temperature oscillations, or 
significant temperature or pressure gradients. In this case, the default control system must be 
provided to compensate the system, allowing more subtle corrections and adjustments to be 
made based on preferences. Thus, an expert system is provided, which is updated based on user 
input, and which receives context information, including sensor data and other inputs. Explicit 
user preferences and programming are also input, preferably with an interface in accordance with 
the present invention or incorporating aspects thereof. 

In this example, which may be described with reference to Fig. 23, sufficient sensors in a 
sensor array 2301 are provided, including light, temperature, humidity, pressure and air flow 
sensors, and possibly a sensor for determining an event proximate to the sensor, such as door 
opening. While 
a single sensor array 2301 may provide input, a plurality of sensor arrays are preferably 
employed in complex installations. The sensors, with the possible exceptions of the flow sensor 
and event sensor, may be housed in a single sensor head. Further, the temperature and pressure 
sensors may be combined in a single integrated circuit by known means. The light and 
temperature sensors are known to those skilled in the art, and need not be described herein. The 
pressure sensor may be a Sensym strain gage pressure transducer, a Motorola pressure transducer 
device, or the like, which are known in the art. Alternatively, other types of sensors may be 
used, for example a micromachined silicon force balance pressure transducer, similar in electrical 
design to the Analog Devices monolithic accelerometers, ADXL-50 or ADXL-05. 

The humidity sensor is preferably an electronic type, producing an electrical signal 
output. It need not be internally compensated for the other measured environmental factors, as 
the constellation of sensors may compensate each other. The air flow sensor may be based on 
pressure differentials, using the electronic pressure sensor described above, or may be a 
mechanical vane type, which is based on flows. In most applications, a single flow axis will be 
sufficient; however, in some circumstances, a two or greater axis sensor will be required. 
Further, in the case of large volume areas, complex turbulent flow patterns may be relevant, for 
which known sensors exist. Laser based air flow sensors may be employed, if desired. LIDAR 
sensors may be used to determine flow rate, direction, and turbulence. 

The event sensor may be of any type, and depends particularly on the event being 
measured. In the present case, where a door opening is to be detected, it is preferred that the 
environmental control be interfaced with a perimeter intrusion alarm system, which, for example, 
provides a magnet embedded in the door and a magnetic reed switch in the door frame. 
Individual sensors are normally wired to the alarm control panel, thus providing central access to 
many or all of the desired event detection sensors while minimizing the added cost. The event 
detector may also be an ultrasonic, infrared, microwave-Doppler, mechanical, or other type of 
sensor. Wireless sensors may also be used, communicating via infrared beams, acoustic, radio 
frequency, e.g., 46-49 MHz, 900 MHz, 2.4 GHz, 5.2-5.8 GHz, or other bands, using analog, 
digital or multilevel quantized digital AM, FM, PSK, QAM, or other modulation scheme, and/or 
spread spectrum techniques (frequency hopping and/or direct sequence spread spectrum) or a 
combination thereof. Spread spectrum devices may be employed, as well as time, code or 
frequency multiplexing or a combination thereof. Various failsafe mechanisms are preferably 
included, including those identifying transmitter or receiver failure, communication interference 
or message collision, and other conditions. A reverse communication channel may also be 
included, either symmetric in band, or asymmetric in band or out of band, for communication 
with the sensor or apparatus associated with the sensor, and as part of the failsafe system. A 
forward error correction protocol is preferably effected, which may detect errors and include 
error correcting codes for digital transmissions. Digital data may be encrypted, and the 
transmission modulation scheme may also include an encrypted sequence of frequency, phase, 
convolution, noise, or other modulation parameter. 

While wireless data transmission as described above may be used, the preferred method 
of receiving sensor information is through a serial digital or analog (e.g., 4-20 mA transmitter) 
data transmission which may be multiplexed and/or part of a local area network scheme, with 
minimal local processing of the sensor data by the microprocessor 2302 with the serial link 
2302a in the sensor head. Such serial digital protocols and physical transport layers include 
Echelon LON-works, BSR X-10, CEBUS, RS-232, RS-423, Apple ADB, Appletalk, Ethernet 
(10 Base T, 10 Base 2, 10 Base 5, 100 Base T, 100 Base VG), ATM, USB, IEEE-1394, Homerun 
(Intel/Tut), etc. This system allows the central control 2303 to incorporate the desired 
processing, e.g., by the pattern recognition system 2304, etc., while minimizing the installation 
expense. A simple microprocessor device 2302 in the sensor head interfaces the sensing 
elements, and may provide analog-to-digital conversion, or other conversion which may be 
necessary, of the sensor signal. In the case of a serial digital data transmission, the local 
microprocessor formats the sensor data, including a code indicating the sensor serial number and 
type, the sensor status (i.e., operative, defective, in need of maintenance or calibration, etc.), the 
sensor data, and an error correcting code. In the case that the data is transmitted on a local area 
network, the microprocessor also arbitrates for bus usage and the messaging protocol. 
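
The message formatting performed by the local microprocessor may be pictured with the sketch 
below. The field widths, byte order and status codes are assumptions chosen for illustration. 
In particular, a CRC-32 checksum is used here as a stand-in that only detects errors, whereas 
the text above contemplates an error correcting code. 

```python
import struct
import zlib

# Hypothetical status codes for the sensor status field.
STATUS = {"operative": 0, "defective": 1, "needs_maintenance": 2}

# Big-endian: 32-bit serial number, 8-bit type, 8-bit status, 32-bit float reading.
HEADER = struct.Struct(">IBBf")


def encode(serial, sensor_type, status, reading):
    """Pack one sensor report and append a CRC-32 of the packed fields."""
    body = HEADER.pack(serial, sensor_type, STATUS[status], reading)
    return body + struct.pack(">I", zlib.crc32(body))


def decode(frame):
    """Verify the checksum and unpack (serial, type, status code, reading)."""
    body = frame[:-4]
    (crc,) = struct.unpack(">I", frame[-4:])
    if zlib.crc32(body) != crc:
        raise ValueError("checksum mismatch")
    return HEADER.unpack(body)


frame = encode(1001, 2, "operative", 21.5)
print(len(frame))     # 14
print(decode(frame))  # (1001, 2, 0, 21.5)
```

Placing the serial number and type in every message lets the central control identify the 
source of each reading without per-sensor wiring, in keeping with the multiplexed or networked 
arrangement described above. 
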

The control, it must be understood, has a number of available operative systems at its 
disposal, comprising the plant 2306. In this case, the system is a forced air heating and cooling 
system. This system has a heating unit, a humidifier, blowers, a cooling unit (which also 
dehumidifies), ducts, dampers, and possible control over various elements, such as automated 
door openers. 

As described above, the system is installed with a complete array of sensors, some of 
which may be shared with, or a part of, other control systems in the environment, and begins 
operation with a basic acceptable initial control protocol. The system then receives data from the 
sensors, and correlates data from the various sensors, including the event sensors, with the 
operation of the systems being controlled. In such a case, a "door open" event may be correlated 
with a change in other measured variables. The system then correlates the control status with the 
effect on the interrelation of the measured variables. Thus, the system would detect that if the 
blower is operating while the door is open, then there is a high correlation that air will flow out 
of the door, unless a blower operates to recirculate air from a return near the door. Thus, the 
system will learn to operate the proximate return device while the door is open and the blower is 
on. Once this correlation is defined, the system may further interrelate the variables, such as 
wind speed and direction outside the door, effects of other events such as other open doors, the 
absolute and relative speeds of the blowers and the return device, the effect of various damper 
devices, etc. It is further noted that, under some circumstances, an exchange of air through an 
open door is desired, and in such instance, the system may operate to facilitate the flow through 
such an open door. Finally, the system must be able to "learn" that conditions may exist which 
produce similar sensor patterns which should be handled differently. An example is a broken, 
defective or inoperative sensor. In such a case, the system must be able to distinguish the type of 
condition, and not execute an aggressive control algorithm in an attempt to compensate for an 
erroneous reading or otherwise normal event. For this purpose the intelligent control of the 
present invention is advantageous. In order to distinguish various events, sensors that provide 
overlapping or redundant information, as well as providing a full contextual overview, should be 
provided as a part of the system. 
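
The correlation of a "door open" event with other measured variables may be illustrated using 
an ordinary Pearson correlation coefficient, as sketched below. The sample readings are invented 
for the example, and the Pearson statistic is only one of many measures that could be used. The 
text above does not prescribe a particular one. 

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


# Invented sample data: one reading per time step.
door_open = [0, 0, 1, 1, 0, 1, 0, 0]                       # event sensor state
flow_at_door = [0.1, 0.0, 0.9, 1.1, 0.1, 1.0, 0.0, 0.2]    # air flow near the door
humidity = [40, 41, 40, 41, 40, 41, 40, 41]                # largely unrelated variable

strong = pearson(door_open, flow_at_door)
weak = pearson(door_open, humidity)
print(strong > 0.9, abs(weak) < 0.5)  # True True
```

A variable that correlates strongly with the event, such as the air flow here, is a candidate 
for the control to act upon, while a weakly correlated variable can be given less weight. 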

It is further noted that energy efficiency is a critical issue in climate control systems, and 
an absolute and continuous control over the internal environment may be very inefficient. Thus, 
the starting of large electrical motors may cause a large power draw, and simultaneous starting of 
such equipment may increase the peak power draw of a facility, causing a possible increase in the 
utility rates. Further, some facilities may operate on emergency or private power generation (co- 
generation) which may have different characteristics and efficiency criteria. These factors may 
all be considered in the intelligent control. It is also noted that a higher efficiency may also be 
achieved, in certain circumstances, by employing auxiliary elements of the climate control 
system which have a lower capacity and lower operating costs than the main elements. Thus, for 
example, if one side of a building is heated by the sun, it may be more efficient to employ an 
auxiliary device which suitably affects, i.e. compensates, only a part of the building. If such 
equipment is installed, the aggregate efficiency of the system may be improved, even if the 
individual efficiency of an element is lower. Likewise, it may be preferable to run a 2-1/2 ton air 
conditioning unit continuously, rather than a 5 ton air conditioning unit intermittently. The 
present intelligent control allows a fine degree of control, making use of all available control 
elements, in an adaptive and intelligent manner. 
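
As one illustration of limiting peak power draw, the starting of large motors may be staggered 
so that their combined inrush load stays under a limit. The sketch below uses a simple 
first-fit assignment of motors to successive start slots. The motor names, inrush figures, 
limit and slot length are hypothetical, the load of equipment already running is ignored, and 
this scheduling method is an assumption rather than a technique stated in the text. 

```python
def stagger_starts(motors, peak_limit_kw, inrush_seconds=5):
    """Assign start times so that summed inrush draw never exceeds the limit.

    motors: list of (name, inrush_kw). Returns {name: start_second}.
    Motors are placed largest first into the earliest slot with room.
    """
    schedule, slots = {}, []  # slots[i] = inrush load already placed in slot i
    for name, inrush_kw in sorted(motors, key=lambda m: -m[1]):
        if inrush_kw > peak_limit_kw:
            raise ValueError(f"{name} alone exceeds the limit")
        for i, load in enumerate(slots):
            if load + inrush_kw <= peak_limit_kw:
                slots[i] += inrush_kw
                schedule[name] = i * inrush_seconds
                break
        else:
            # No existing slot has room: open a new, later slot.
            slots.append(inrush_kw)
            schedule[name] = (len(slots) - 1) * inrush_seconds
    return schedule


motors = [("compressor", 30), ("blower", 12), ("pump", 8)]
print(stagger_starts(motors, peak_limit_kw=40))
# {'compressor': 0, 'pump': 0, 'blower': 5}
```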

Returning to the situation of a door opening event, the system would take appropriate 
action, including: interruption of normal climate control until after the disturbance has subsided 
and normal conditions are achieved; initiation, based on the actual or predicted climatic 
conditions, of a climate compensation control designed to maximize efficiency and also 
maintain climatic conditions during the disturbance, as well as return to normal after the 
disturbance; and, optionally, during the door opening disturbance, control of a 
pressure or flow of air to counterbalance a flow through the door, by using a fan, blower or other 
device, or halting such a device, if necessary. It is also noted that the climatic control system 
could also be outfitted with actuators for opening and closing doors and windows, or an interface 
with such other system, so that it could take direct action to correct the disturbance, e.g., by 
closing the door. The internal and external ambients may differ in 
temperature, humidity, pollutants, or the like, and appropriate sensors may be employed. 

It is thus realized that the concepts of using all available resources to control an event, as 
well as using a predictive algorithm in order to determine a best course of action and a desired 
correction are a part of the present invention. 

EXAMPLE 19 

REMOTE CONTROL HARDWARE 

A remote control of the present invention may be constructed from, for example, a 
Micromint (Vernon, CT) RTC-LCD, RTC-V25 or RTC-HC11 or RTC180 or RTC31/52, and 
RTC-SIR, in conjunction with an infrared transmitter and receiver, input keys and a compatible 
trackball, which may provide raw encoder signals, or may employ a serial encoder and have a 
serial interface to the processor module. A power supply, such as a battery, is used. The use, 
interfacing and programming of such devices is known to those skilled in the art, and such 
information is generally available from the manufacturer of the boards and the individual circuit 
elements of the boards. The function of such a remote control is to receive inputs from the 
trackball and keys and to transmit an infrared signal to the controller. 

The processor and display, if present, may provide added functionality by providing a 
local screen, which would be useful for programming feedback and remote control status, as well 
as compressing the data stream from the trackball into a more efficient form. In this case, certain 
of the extracted information may be relevant to the determination of the user level, so that 
information related to the user level would be analyzed and transmitted separately to the
controller by the infrared transmitter. If the local LCD screen is used in the programming
process, then the main controller would transmit relevant information to the remote display, by a 
reverse-channel infrared link. These components are known in the art, and many other types may 
also be used in known manner. 

Personal digital assistants ("PDAs"), such as those available from 3Com
(Palm Pilot III, Vx, VII), Microsoft Windows CE-based devices, BeOS devices, etc., may also be
employed, in known manner, as a human interface device.

EXAMPLE 20 

MEDICAL DEVICE INTERFACE 

The interface and intelligent control of the present invention are applicable to control 
applications in medicine or surgery. This system may also be described with reference to the 
generic system drawings of Figs. 23 and 24. In this case, an operator identifies himself and 
enters information regarding the patient, through the interface 2305. The interface 2305 
automatically loads the profile 2406 of both the operator and the patient, if the device is used for 
more than one at a time, and is connected to a database containing such information, such as a 
hospital central records bureau. The interface may be connected to various sensors, of the input 
device 2401, such as ambient conditions (temperature, humidity, etc.), as well as data from the 
patient, such as electrocardiogram (EKG or ECG), electromyograph (EMG), 
electroencephalogram (EEG), Evoked Potentials, respirator, anesthesia, temperature, catheter 
status, arterial blood gas monitor, transcutaneous blood gas monitor, urinary output, intravenous 
(IV), intraperitoneal (IP), Intramuscular (IM), subcutaneous (SC), intragastric or other types of 
solutions, pharmaceutical and chemotherapy administration data, mental status, movement, 
pacemaker, etc. as well as sensors and data sources separate from the patient such as lab results, 
radiology and medical scanner data, radiotherapy data and renal status, etc. Based on the 
available information, the interface 2405, using the simple input device and the display screen 
described above, presents the most important information to the operator, along with a most 
probable course of action. The user then may either review more parameters, investigate further 
treatment options, input new data, or accept the presented option(s). The system described has a 
large memory in the signal analysis module 2409 for recording available patient data from the
signal receiver 2408, and thus assists in medical record keeping and data analysis, as well as
diagnosis. While various systems are available for assisting in both controlling medical devices 
and for applying artificial intelligence to assist in diagnosis, the present system allows for 
individualization based on both the service provider and the patient. Further, the present 
invention provides the improved interface for interaction with the system. 

It is further noted that, analogously to the library function discussed above, medical 
events may be characterized in the characterization unit 2407 and recorded by the plant 2404, so 
that a recording of the data need not be reviewed in its entirety in order to locate a particular 
significant event, and the nature of this event need not be determined in advance. It is also noted 
that the compression feature of the recorder of the present invention could be advantageously 
employed with the large volume of medical data that is often generated. Medical image data
may be compressed as known in the art, by standard image compression techniques, and/or 
image compression techniques optimized for radiology, nuclear medicine and ultrasonography 
data. Other types of data may be compressed using lossless algorithms, or by various vector 
quantization, linear excited models, or fractal compression methods. It is finally noted that, 
because of its ability to store and correlate various types of medical data in the characterization 
unit 2407, the system could be used by the operator to create notes and discharge summaries for 
patients, using the database stored in the local database 2413, as well as the user history and 
preferences 2406. Thus, in addition to saving time and effort during the use of the device, it 
would also perform an additional function, that of synthesizing the data, based on medical 
significance. 

In addition to providing the aforementioned intelligence and ease of use, the present 
example also comprises a control 2402, and may interface with any of the sensors and devices,
performing standard control and alarm functions. However, because the present control 2402 is 
intelligent and has pattern recognition capability, in addition to full data integration from all 
available data sources, it may execute advanced control functions. For example, if the present 
control 2402 is interfaced to a controlled infusion pump for, e.g., morphine solution, in e.g., a 
terminally ill patient, then certain parameters must be maintained, while others may be flexible. 
For example, a maximum flow rate is established as a matter of practice as a safety measure; too 
high a flow rate could result in patient death. However, a patient may not need a continuous
infusion of a constant dose of narcotic. Further, as the patient's status changes, the level of
infusion may be advantageously altered. In particular, if the renal status of the patient were to 
change, the excretion of the drug may be impaired. Therefore, by providing the controller with a 
urinary output monitor, it could immediately suppress the morphine infusion as soon as the renal 
output is recognized as being decreased, and further indicate an alarm condition. Further, it may 
be advantageous to provide a diurnal variation in the infusion rate, to provide a "sleep" period 
and a period of heightened consciousness with correspondingly lower levels of narcosis. Where 
various tests, procedures or interviews are scheduled, an appropriate level of narcosis and/or 
analgesia may also be anticipatorily provided at an appropriate time. 
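
The infusion logic described above may be sketched as follows. This is a minimal
illustration under stated assumptions: the names InfusionLimits and infusion_rate, the
numeric limits, the day and night hours, and the scaling factors are all hypothetical and are
not taken from the present disclosure or from clinical practice.

```python
from dataclasses import dataclass


@dataclass
class InfusionLimits:
    max_rate_ml_h: float   # hard safety ceiling on the flow rate (hypothetical value)
    min_urine_ml_h: float  # renal output below this suppresses the infusion


def infusion_rate(requested_ml_h, urine_ml_h, hour, limits,
                  night_factor=1.0, day_factor=0.7):
    """Return (rate, alarm) for one control cycle.

    The rate is suppressed and an alarm raised when renal output is low;
    otherwise a diurnal factor is applied and the safety ceiling enforced.
    """
    if urine_ml_h < limits.min_urine_ml_h:
        return 0.0, True
    # Assumed schedule: "sleep" period from 22:00 to 06:00, lower narcosis by day.
    factor = night_factor if (hour >= 22 or hour < 6) else day_factor
    rate = min(requested_ml_h * factor, limits.max_rate_ml_h)
    return rate, False
```

The fixed ceiling is applied last, so that no combination of the flexible parameters can
exceed the parameter that must be maintained.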

As another example of the use of the present device as a medical controller, the control 
2402 could be interfaced with a cardiac catheter monitor, as a part of the signal receiver 2408. In 
such a case, normally, alarms are set based on outer ranges of each sensor measurement, and 
possibly a simple formula relating two sensor measurements, to provide a useful clinical index. 
However, by incorporating the advanced interface and pattern recognition function of the present 
invention, as well as its ability to interface with a variety of unrelated sensors, the present device, 
including the present control, may be more easily programmed to execute control and alarm 
functions, may provide a centralized source of patient information, including storage and 
retrieval, if diverse sources of such information are linked, and may execute advanced, adaptive 
control functions. The present control 2402 is equipped to recognize trends in the sensor data 
from the signal receiver 2408, which would allow earlier recognition and correction of various 
abnormal conditions, as well as recognizing improvements in conditions, which could allow a 
reduction in the treatment necessary. Further, by allowing a fine degree of control, parameters 
may be maintained within optimal limits for a greater percentage of the time. In addition, by 
monitoring various sensors, various false alarms may be avoided or reduced. In particular, false 
alarms may occur in prior art devices even when sensors do not indicate a dangerous condition, 
merely as a safety precaution when a particular parameter is out of a specified range. In such a 
case, if a cause of such abnormal condition may be identified, such as patient movement or the 
normal activities of the patient's caretakers, then such condition may be safely ignored, without 
indicating an alarm. Further, even if a sensor parameter does in and of itself indicate a dangerous 
condition, if a cause, other than a health risk, may be identified, then the alarm may be ignored,
or at least signaled with a different level of priority. By providing an intelligent and active filter
for false alarm events, the system may be designed to have a higher level of sensitivity and 
specificity to real health risks, and further to provide a finer level of control based on the sensor 
readings, with fewer false positive readings. 
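
The false alarm filter described above may be sketched as follows. The three priority
levels, the function name classify_alarm, and the use of a single flag for an identified
benign cause (such as patient movement) are assumptions made for illustration only.

```python
from enum import Enum


class Priority(Enum):
    NONE = 0
    LOW = 1
    HIGH = 2


def classify_alarm(value, normal_range, danger_range, benign_cause):
    """Assign an alarm priority to one sensor reading.

    normal_range is the specified range; danger_range is the wider range
    outside of which the reading by itself indicates a dangerous condition.
    """
    low, high = normal_range
    danger_low, danger_high = danger_range
    if low <= value <= high:
        return Priority.NONE
    if value < danger_low or value > danger_high:
        # Dangerous in itself: downgrade, rather than ignore, if explained.
        return Priority.LOW if benign_cause else Priority.HIGH
    # Out of the specified range only: safely ignored if a cause is identified.
    return Priority.NONE if benign_cause else Priority.LOW
```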

EXAMPLE 21 

SECURITIES TRADING TERMINAL INTERFACE 

The present invention is also of use in automated securities, debt, variable yield and 
currency trading systems, where many complex functions are available, yet often a particular 
user under particular circumstances will use a small subset of the functionality available at a
given time. Such a situation would benefit from the present interface, which provides adaptive
user levels, prioritized screen information presentation, and pattern recognition and intelligent
control. A securities trading system is disclosed in U.S. Patent 5,034,916, for a mouse driven
Fast Contact Conversational Video System, incorporated herein by reference. The present
system relates primarily to the user terminal, wherein the user must rapidly respond to external
events, in order to be successful. In such a case, the advantages of the application of an interface
according to the present invention are clear and discussed above, and need not be detailed at this
point. However, the pattern recognition functions of the present invention may be applied to
correspond to the desired actions of the trader, unlike in prior intelligent trading systems, where
the terminal is not individually and adaptively responsive to the particular user. Thus, the system
exploits the particular strengths of the user, facilitating his actions, including: providing the
desired background information and trading histories, in the sequence most preferred by the user;
following the various securities to determine when a user would execute a particular transaction,
and notifying the user that such a condition exists; monitoring the success of the user's strategy,
and providing suggestions for optimization to achieve greater gains, lower risk, or other
parameters which may be defined by the user. Such a system, rather than attempting to provide a
"level playing field" to all users of like terminals, allows a user to use his own strategy, providing 
intelligent assistance. By enhancing the interface, a user becomes more productive with fewer 
errors and faster training.

EXAMPLE 22

FRACTAL THEORY PATTERN RECOGNITION 

Affine transforms are typically mathematical manipulations of data in two dimensions, 
wherein the manipulation comprises a rotation, scaling and a displacement for each of the two 
coordinates. Schroeder, M., Fractals, Chaos, Power Laws , W.H. Freeman & Co., New York 
(1991). Of course, Affine transforms of higher dimensionality may also be employed. In 
describing an image using Affine transforms, the degree of matching between an image and the 
mathematical description of that image may be related by a number of iterations, and the fewer 
the iterations, the less data used to describe the image. Of particular importance in the field of 
graphics is the speed of "convergence", i.e., that relatively few iterations are necessary in order
to describe an image with sufficient precision to be visually useful. Therefore, the Affine 
transform mathematical specifications may be far more compact than the raw image data, and 
these specifications compare favorably to other types of image compression, such as discrete cosine
transformation (DCT) compression schemes, including JPEG, depending on a number of factors. 
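
A two-dimensional Affine transform of the kind described above, comprising a rotation, a
scaling and a displacement, may be sketched as follows. The function names and the use of a
single uniform scaling factor are assumptions made for illustration.

```python
import math


def affine(point, angle, scale, shift):
    """Rotate a 2-D point by angle (radians), scale it, then displace it."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (scale * (c * x - s * y) + shift[0],
            scale * (s * x + c * y) + shift[1])


def iterate(point, angle, scale, shift, n):
    """Apply the same Affine transform n times in succession."""
    for _ in range(n):
        point = affine(point, angle, scale, shift)
    return point
```

With a scaling factor less than 1, repeated application converges toward a fixed point
regardless of the starting point; for example, with no rotation, a scale of 0.5 and a
displacement of (1, 1), the iterates approach (2, 2).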

Because Affine transforms may be used to produce a compact visual description of an image,
among other reasons, the present invention may apply this type of transform to a
pattern matching system for analyzing image contents. The related wavelet transforms, all under 
the general schema of multiresolution image analysis, may also be employed. 

Pattern recognition, in this case, may proceed on an image basis, to match similar images, 
or on an object basis, in which portions of images are matched. It is preferred that the pattern 
matching system be robust, i.e., tolerant of various alterations of an image, artifacts, interference 
and configurational changes, while specific enough to allow object differentiation. 

In the case of video images, therefore, it is preferred that various two-dimensional 
projections of three-dimensional objects, in various "poses", be classified the same. This 
therefore requires that, in analyzing a two-dimensional image, the object be extracted from a 
background image and separated from other objects. Further, degrees of freedom may be 
determined, such as through analysis of a sequence of frames to reveal relative motion or change 
of portions of the object with respect to other portions. Finally, the object in the image is
compared to three (or higher) dimensional models or exemplars, through various projections.

In the case of two dimensional image analysis, the image should be analyzed according to
robust starting criteria, so that the similarity of images may be determined by comparison of
normalized Affine transformation coefficients. 

Fractal analysis, the study of self-similarity, and a superset of Affine transformation 
analysis, allows a compact representation of an image or an object in an image, and due to its 
encompassing of various spatial relationships of object parts, may permit normalized transforms 
to be compared. In other words, assuming that the object is extracted from a background scene, 
and various degrees of freedom are identified, an Affine transformation may be applied, which 
will yield a similar result for an image of the same object in a different "pose", i.e., with different 
exercise of its degrees of freedom. It is noted that this Affine transform is generally not 
optimized for highest global compression ratio, although to achieve a match, a transform with the 
lowest Hausdorff distance from the original, for particular portions of the image, may be
compared. 

While in general, Affine transformations are described with respect to two-dimensional 
images, these may also be applied to three dimensional images. Thus, if a triangular polygon is 
rotated, scaled and displaced in a two dimensional image, a tetrahedron is rotated, scaled and 
displaced in a three dimensional system. Further, analogies may also be drawn to the time 
dimension (although geometric forms which are rotated, scaled and displaced over time are not 
given trivial geometric names). Because, in a contractive Affine transformation (one in which 
the scaling factor of successive iterations is less than 1), continued iterations are generally less 
significant, objects described with varying level of detail may be compared. Even images that 
are not normalized may still be compared, because at every level of the transform, slight changes 
in rotation, scale and displacement are accounted for. 

According to the present invention, nonlinear self-similarity may also be used. Further, 
in objects having more than two dimensions, linear scaling other than rotation, scaling and 
displacement may be described. 

It is noted that many types of optical computers, especially those including holographic 
elements, employ transformations similar to Affine transformations. Therefore, techniques of the 
present invention may be implemented using optical computers or hybrid optical-electronic 
computers.

Thus, according to the present invention, the fractal method employing Affine transforms
may be used to recognize images. This method proceeds as follows. A plurality of templates are 
stored in a memory device, which represent the images to be recognized. These templates may 
be preprocessed, or processed in parallel with the remainder of the procedure, in a corresponding 
manner. Image data, which may be a high contrast line image, a greyscale image, or an image
having a full color map, the greyscale being a unidimensional color map, is stored in the data
processor provided for performing the recognition function.

The image is preprocessed to extract various objects from the background, and to separate 
objects. This preprocessing may be performed in standard manner. The method of U.S. Patent 
No. 5,136,659, incorporated herein by reference, may also be used. As a part of this 
preprocessing, a temporal analysis of the object through a series of image frames is performed to
provide four dimensional data (space plus time) about the object, i.e., the two dimensions from
the image, a third dimension imputed from differing perspective views of the object, and time.
Certain objects may be immediately recognized or classified, without further processing. 
Further, certain objects, without full classification or identification, may be "ignored" or 
subjected to a lesser level of final processing. During the classification processing, various 
objects may be selected for different types of processing, for example, people, automobiles, 
buildings, plants, etc. See, e.g., U.S. Patent No. 5,970,173, expressly incorporated herein by 
reference. 

After classification, and temporal analysis, an object for further processing is analyzed for 
degrees of freedom, i.e., joints of a person, moving parts of an object, etc. These degrees of 
freedom may then be corrected, e.g., the object itself altered, to change the image into a standard 
format, or the degree of freedom information processed with the object to allow mathematical 
normalization without actual change of the image. 

The information describing the object image is stored. A plurality of addressable 
domains are generated from the stored image data, each of the domains representing a portion of 
the image information. As noted above, the entire image need not be represented, and therefore
various objects may be separately analyzed. Further, only those parts of the image or object necessary
for the recognition, need be analyzed. While it may be unknown which image components are 
unnecessary, sometimes this may be determined.

From the stored image data, a plurality of addressable mapped ranges are created,
corresponding to different subsets of the stored image data. Creating these addressable mapped 
ranges, which should be uniquely addressable, also entails the step of executing, for each of the 
mapped ranges, a corresponding procedure upon the one of the subsets of the stored image data 
which corresponds to the mapped ranges. Identifiers are then assigned to corresponding ones of 
the mapped ranges, each of the identifiers specifying, for the corresponding mapped range, a 
procedure and an address of the corresponding subset of the stored image data.

To ensure comparability, the processing treatment of the template and the image data are 
analogous. Of course, template data may be stored in preprocessed form, so that the image data 
need only be processed according to the same rules. The domains are optionally each subjected 
to a transform, which may be a predetermined rotation, an inversion, a predetermined scaling, 
and a displacement. Because of the nature of these linear superposable transforms, the earliest 
iterations will include data about gross morphology, later iterations will include data about 
configuration, and latest iterations will include data about texture. 

In addition, nonlinear alterations, and frequency, Gabor or wavelet transform 
preprocessing may be applied. A warping or other kind of transform may also be applied. These 
types of transforms are generally not included in Affine transform analysis, yet judiciously 
applied, may produce more rapid convergence, greater data storage efficiency, computational 
advantages or pattern matching advantages. 

This transform is used to optimize the procedure, and also to conform the presentation of 
the image data with the template, or vice versa. Each of the domains need not be transformed the 
same way, and in fact it is the transform coefficients which are stored to describe the transformed 
object, so that differences in coefficients relate to differences in objects. 

For each of the domains or transformed domains, as may be the case, the one of the 
mapped ranges which most closely corresponds according to predetermined criteria (which may 
include both local and global considerations), is selected. The image is then represented as a set 
of the identifiers of the selected mapped ranges. 
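
The steps of generating domains, creating mapped ranges with identifiers, and selecting
the closest mapped range for each domain may be sketched as follows. This is an
illustrative sketch under stated assumptions: the image is a square grid of grey values
whose sides are multiples of the domain size, the only procedure is a hypothetical "shrink"
that averages 2x2 cells, and mean square error stands in for the predetermined criteria.

```python
def block(img, r, c, size):
    """Return the size x size subset of img whose top-left address is (r, c)."""
    return [row[c:c + size] for row in img[r:r + size]]


def shrink(blk):
    """Hypothetical procedure: average each 2x2 cell, halving the block size."""
    n = len(blk) // 2
    return [[(blk[2 * i][2 * j] + blk[2 * i][2 * j + 1]
              + blk[2 * i + 1][2 * j] + blk[2 * i + 1][2 * j + 1]) / 4
             for j in range(n)] for i in range(n)]


def mse(a, b):
    """Mean square error between two equally sized blocks."""
    count = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / count


def encode(img, size):
    """Map each domain address to the identifier of its closest mapped range."""
    h, w = len(img), len(img[0])
    # Each identifier specifies a procedure and the address of an image subset.
    ranges = {("shrink", (r, c)): shrink(block(img, r, c, 2 * size))
              for r in range(0, h - 2 * size + 1, size)
              for c in range(0, w - 2 * size + 1, size)}
    code = {}
    for r in range(0, h, size):
        for c in range(0, w, size):
            dom = block(img, r, c, size)
            code[(r, c)] = min(ranges, key=lambda k: mse(dom, ranges[k]))
    return code
```

The returned mapping is the set of identifiers that represents the image; no pixel values
are stored in it.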

Finally, from the stored templates, a template is selected which best corresponds to the 
set of identifiers representing the image information. This matching process is optimized for the 
data type, which is a string of iterative transform coefficients, of a contractive transform. 
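
One way such a match might be computed is sketched below. The geometric weighting, the
function names and the template labels are assumptions; the sketch relies only on the
property that, for a contractive transform, later coefficients are less significant than
earlier ones.

```python
def coefficient_distance(a, b, contraction=0.5):
    """Weighted distance over the shared prefix of two coefficient strings.

    Each successive coefficient is weighted by a further factor of
    `contraction`, so strings of differing length remain comparable.
    """
    total, weight = 0.0, 1.0
    for x, y in zip(a, b):
        total += weight * abs(x - y)
        weight *= contraction
    return total


def best_template(code, templates, contraction=0.5):
    """Return the name of the stored template closest to the image code."""
    return min(templates,
               key=lambda name: coefficient_distance(code, templates[name], contraction))
```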

It is preferred that, for each domain, a best corresponding one of the mapped ranges be
selected. By performing analogous operations on a template and an unrecognized object in an 
image, a correspondence between the two may be determined. Thus, libraries of template image 
portions may be provided, with associated transform information, which may increase the 
computational efficiency of the system. 

In selecting the most closely corresponding one of the mapped ranges, for each domain, 
the mapped range is selected which is the most similar, by a method which is appropriate, and
which may be, for example, selecting the minimum Hausdorff distance from the domain, the
highest cross-correlation with the domain, the minimum mean square error with the domain, or
the highest fuzzy correlation with the domain, based on rules which may be
predetermined. Neural network energy minimization may also yield the best fit, and other 
techniques may also be appropriate. 

In particular, the step of selecting the most closely corresponding one of mapped ranges 
according to the minimum modified Hausdorff distance includes the step of selecting, for each 
domain, the mapped range with the minimum modified Hausdorff distance calculated as 
D[db, mrb] + D[1 - db, 1 - mrb], where D is a distance calculated between a pair of sets of data
each representative of an image, db is a domain, mrb is a mapped range, 1 - db is the inverse of a
domain, and 1 - mrb is the inverse of a mapped range.
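
The modified Hausdorff distance may be sketched as follows for binary images. It is
assumed here, for illustration, that D is the symmetric Hausdorff distance between the sets
of "on" pixels, that images are equally sized grids of 0 and 1, and that the inverse of an
image is its pixelwise complement; the function names are hypothetical.

```python
import math


def points(img):
    """Coordinates of the nonzero pixels of a binary image."""
    return [(r, c) for r, row in enumerate(img) for c, v in enumerate(row) if v]


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    if not a and not b:
        return 0.0
    if not a or not b:
        return math.inf  # one set empty: no finite distance exists

    def directed(p, q):
        return max(min(math.dist(u, v) for v in q) for u in p)

    return max(directed(a, b), directed(b, a))


def invert(img):
    """Pixelwise complement of a binary image, i.e. 1 - img."""
    return [[1 - v for v in row] for row in img]


def modified_hausdorff(db, mrb):
    """D[db, mrb] + D[1 - db, 1 - mrb] for a domain db and mapped range mrb."""
    return (hausdorff(points(db), points(mrb))
            + hausdorff(points(invert(db)), points(invert(mrb))))
```

Including the inverse term penalizes differences in the background as well as in the
foreground, so that two blocks agree only when both their set and unset pixels are close.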

It is important that the selection criteria be tolerant to variations of the type seen in image 
data, e.g., video, so that like objects have similar transforms. Thus, the selection criteria are not
particularly directed to optimal data compression, although the two criteria may coincide for 
some types of data. 

In the case where the digital image data consists of a plurality of pixels, each having one 
of a plurality of associated color map values, the method includes a matching of the color map, 
which as stated above, encompasses a simple grey scale, natural color representation, and other 
color types. In such a case, the method is modified to optionally transform the color map values 
of the pixels of each domain by a function including at least one scaling function, for each axis of 
the color map, each of which may be the same or different, and selected to maximize the 
correspondence between the domains and ranges to which they are to be matched. For each of 
the domains, the one of the mapped ranges having color map pixel values is selected which most
closely corresponds to the color map pixel values of the domain according to predetermined
criteria, wherein the step of representing the image color map information includes the substep of
representing the image color map information as a set of values each including an identifier of
the selected mapped range and the scaling functions. The correspondence method may be of any
sort and, because of the added degree of complexity, may be a different method than that chosen
for non-color images. The method of optimizing the correspondence may be minimizing the
Hausdorff distance or other "relatedness" measurement between each domain and the selected
range. The recognition method concludes by selecting a most closely corresponding stored
template, based on the identifier of the color map mapped range and the scaling functions, which
is the recognized image.
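
The selection of a scaling function for each axis of the color map may be sketched as
follows. A single multiplicative scale per axis, chosen by least squares, is an assumption
made for illustration; any other measure of correspondence could be substituted.

```python
def best_scale(domain_vals, range_vals):
    """Scale s minimizing the sum of (s * d - r) ** 2 over paired values."""
    den = sum(d * d for d in domain_vals)
    if den == 0:
        return 0.0
    return sum(d * r for d, r in zip(domain_vals, range_vals)) / den


def color_scales(domain_px, range_px):
    """One least-squares scale per color axis.

    Pixels are equal-length tuples, e.g. (red, green, blue); a greyscale
    image is the special case of one-element tuples.
    """
    axes = len(domain_px[0])
    return tuple(best_scale([p[a] for p in domain_px], [q[a] for q in range_px])
                 for a in range(axes))
```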

Color information may have less relevance to pattern recognition than, for example, edge
information, and therefore may be subjected to a lesser degree of analysis. The color information
may also be analyzed separately, using a different technique.

EXAMPLE 24

IMAGE ANALYSIS

Alternatively to the object extraction, the image as a whole may be analyzed. In the case
of moving images, the aforementioned method is further modified to accommodate time varying
images. These images usually vary by small amounts between frames, and this allows a
statistical improvement of the recognition function by compensating for a movement vector, as
well as any other transformation of the image. This also allows a minimization of the processing
necessary because redundant information between successive frames is not subject to the full
degree of processing. Of course, if the image is substantially changed, then the statistical
processing ceases, and a new recognition function may be begun, "flushing" the system of the
old values. The basic method is thus modified by storing delayed image data information, i.e., a
subsequent frame of a moving image. This represents an image of a moving object differing in
time from the image data in the data processor.

A plurality of addressable further domains are generated from the stored delayed image 
data, each of the further domains representing a portion of the delayed image information, and 
corresponding to a domain. Thus, an analogous transform is conducted so that the further
domains each correspond to a domain. A plurality of addressable mapped ranges
corresponding to different subsets of the stored delayed image data are created from the stored 
delayed image data. The further domain and the domain are optionally matched by subjecting a 
further domain to a corresponding transform selected from the group consisting of a rotation, an 
inversion, a scaling, and a displacement, which corresponds to a transform applied to a 
corresponding domain, and a noncorresponding transform selected from the group consisting of a 
rotation, an inversion, a scaling, and a translation, which does not correspond to a transform applied to
a corresponding domain. For each of the further domains or transformed further domains, the 
one of the mapped ranges is selected which corresponds best according to predetermined criteria 
or rules. As stated above, these domains may also be subjected to corresponding and 
noncorresponding frequency domain processing transforms, Gabor transforms, and wavelet 
transforms. 

A motion vector is then computed between one of the domain and the further domain, or 
the set of identifiers representing the image information and the set of identifiers representing the 
delayed image information, and the motion vector is stored. The further domain is compensated 
with the motion vector and a difference between the compensated further domain and the domain 
is computed. For each of the delayed domains, the one of the mapped ranges is selected which 
most closely corresponds according to predetermined criteria. The difference between the 
compensated further domain and the domain is represented as a set of difference identifiers of the 
selected mapping ranges and an associated motion vector. 
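
The computation of a motion vector, and of the difference remaining after compensation,
may be sketched as follows. An exhaustive block search minimizing the sum of absolute
differences is assumed here for illustration; the function names and the search radius are
hypothetical.

```python
def block(frame, r, c, size):
    """Return the size x size block of frame whose top-left address is (r, c)."""
    return [row[c:c + size] for row in frame[r:r + size]]


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def motion_vector(frame, delayed, r, c, size, search=2):
    """Displacement (dr, dc) carrying the domain at (r, c) into the delayed frame."""
    h, w = len(delayed), len(delayed[0])
    dom = block(frame, r, c, size)
    best = None
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            rr, cc = r + dr, c + dc
            if 0 <= rr <= h - size and 0 <= cc <= w - size:
                cost = sad(dom, block(delayed, rr, cc, size))
                if best is None or cost < best[0]:
                    best = (cost, (dr, dc))
    return best[1]


def residual(frame, delayed, r, c, size, vector):
    """Difference between the compensated further domain and the domain."""
    dr, dc = vector
    dom = block(frame, r, c, size)
    comp = block(delayed, r + dr, c + dc, size)
    return [[y - x for x, y in zip(ra, rb)] for ra, rb in zip(dom, comp)]
```

When the object has merely moved between frames, the residual is zero and only the motion
vector need be stored for that domain.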

This method is described with respect to Figs. 27, 28 and 29. Fig. 27 is a basic flow 
diagram of the recognition system of the present invention. Fig. 28 provides a more detailed 
description, including substeps, which are included in the major steps shown in Fig. 27. 
Basically, the image, or a part thereof, is decomposed into a compressed coded version of the 
scene, by a modified fractal-based compression method. In particular, this differs from the prior 
compression algorithms in that only a part, preferably that part containing objects of interest, 
need be fully processed. Thus, if a background is known (identified) or uninteresting, it may be 
ignored. Further, the emphasis is on matching the available templates to produce an image 
recognition, not achieving a high degree of compression. Therefore, the image, or domains 
thereof, may be transformed as required in order to facilitate the matching of the templates. As
with respect to single images, the templates are represented in analogous form, having been
processed similarly, so that a comparison of the relatedness of an object in an image and the 
templates may be performed. In particular, if an oblique view of an object is presented, then 
either the object may be transformed to achieve a predicted front view, or the template 
transformed or specially selected to correspond to the oblique view. Further, once a recognition 
has taken place with a high degree of certainty, the system need only ensure that the scene has 
not changed, and need not continually fully process the data. This has implications where 
multiple recognition processes are occurring simultaneously, either in a single scene or in 
different images, wherein the throughput of the recognition apparatus need not meet that required 
for de novo real time recognition of all aspects of all the objects or images. 

In order to limit processing of portions of images, exclusionary criteria may be applied 
which allow truncation of processing when it is determined that an option is precluded or there 
exists a significantly higher probability alternative. The processing system may use primarily 
exclusionary criteria to select the best predictions, or after preselection, employ a highest 
probability selection system on the remaining choices. 
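
By way of illustration only, the two-stage selection described above might be sketched as follows. The function name `shortlist`, the representation of exclusionary criteria as predicates, and the `dominance_ratio` cutoff are assumptions made for this sketch and are not taken from the specification.

```python
from typing import Callable, Dict, List


def shortlist(
    scores: Dict[str, float],
    exclusions: List[Callable[[str], bool]],
    dominance_ratio: float = 10.0,  # hypothetical cutoff for "significantly higher probability"
) -> List[str]:
    """Return the surviving candidates, most probable first."""
    # Stage 1: truncate processing of any candidate an exclusionary criterion precludes.
    survivors = {
        name: p for name, p in scores.items()
        if not any(rule(name) for rule in exclusions)
    }
    if not survivors:
        return []
    # Stage 2: drop candidates for which a significantly more probable alternative exists.
    best = max(survivors.values())
    survivors = {n: p for n, p in survivors.items() if p * dominance_ratio >= best}
    # Stage 3: highest-probability ordering of the remaining choices.
    return sorted(survivors, key=survivors.get, reverse=True)


candidates = {"car": 0.6, "truck": 0.3, "house": 0.05, "boat": 0.5}
ranked = shortlist(candidates, exclusions=[lambda name: name == "boat"])
```

In this sketch the first element of `ranked` is the selected prediction; the remaining elements are the alternatives that were neither precluded nor dominated.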

Fig. 30 shows a flow diagram of a cartoon-like representation of an image recognition 
method of the present invention. It shows, initially, an input image 3001, having a degree of 
complexity. A windowing function 3002 isolates the object from the background. A first order 
approximation of the image is generated 3003, here called a mapping region. The first order 
approximation is then subtracted from the initial image to produce a difference 3004. The first 
order error is then subjected, iteratively, to successive transform and subtract operations 3005 
and 3006, until the error is acceptably small, at which point the input image is characterized by a 
series of codes, representing the first order approximation and the successive transforms, which 
are stored 3008. These codes are then compared with stored templates 3009. The comparisons 
are then analyzed to determine which template produces the highest correlation 3010, and the 
match probability is maximized 3011. The recognized image is then indicated as an output 3012. 
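
The approximate, subtract and repeat loop of Fig. 30 may be illustrated, under simplifying assumptions, by the sketch below. It substitutes a greedy projection onto a small fixed set of basis patterns for the fractal transform, operates on one dimensional signals, and compares code sets by cosine similarity; the names `encode`, `dense` and `best_template` are hypothetical, and none of these choices should be read as the method of the invention itself.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Code = List[Tuple[int, float]]  # (basis index, coefficient), one entry per iteration


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def encode(signal: Vector, basis: List[Vector], tol: float = 1e-9, max_steps: int = 32) -> Code:
    """Approximate, subtract, and repeat on the residual until the error is small."""
    residual = list(signal)
    codes: Code = []
    for _ in range(max_steps):
        if dot(residual, residual) <= tol:
            break  # error acceptably small
        # Pick the basis pattern that best explains the current residual.
        k = max(
            range(len(basis)),
            key=lambda i: abs(dot(residual, basis[i])) / math.sqrt(dot(basis[i], basis[i])),
        )
        coeff = dot(residual, basis[k]) / dot(basis[k], basis[k])
        if coeff == 0.0:
            break  # nothing in the basis reduces the error further
        codes.append((k, coeff))
        residual = [r - coeff * b for r, b in zip(residual, basis[k])]
    return codes


def dense(codes: Code, size: int) -> Vector:
    """Accumulate a code list into one coefficient per basis pattern."""
    out = [0.0] * size
    for k, c in codes:
        out[k] += c
    return out


def best_template(codes: Code, templates: Dict[str, Code], size: int) -> str:
    """Return the name of the stored template whose codes correlate best."""
    v = dense(codes, size)

    def similarity(template: Code) -> float:
        w = dense(template, size)
        denom = math.sqrt(dot(v, v) * dot(w, w))
        return dot(v, w) / denom if denom else 0.0

    return max(templates, key=lambda name: similarity(templates[name]))


WALSH = [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, 1, -1], [1, -1, -1, 1]]
templates = {"striped": encode([6, 2, 6, 2], WALSH), "step": encode([2, 2, 0, 0], WALSH)}
match = best_template(encode([3, 1, 3, 1], WALSH), templates, len(WALSH))
```

The templates are encoded by the same routine as the input, reflecting the statement above that the templates are represented in analogous form.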

This system is shown in Fig. 26, wherein a sensor 2602 provides data, which may be 
image data, to a control 2601. The control 2601 serves to control the plant 2603, which has an 
actuator. The plant 2603 may be a VCR or the like. The control 2601 has associated with it an 
intermediate sensor data storage unit 2611, which may be, for example, a frame buffer or the like. 
The control 2601 also has associated with it a transform engine 2612, which may perform a 
reversible or irreversible transform on the data or stored data. 

The system also has a template input 2610, which may receive data from the sensor 2602, 
if accompanied by identifying information. Thus, the pattern storage memory 2609 stores a 
pattern, such as an image pattern, along with an identifier. 

The control 2601 also has an input device 2604, an on-screen display interface 2605, and 
a program memory 2606, for inputting instructions from a user, providing feedback to the user, 
and recording the result of the user interaction, respectively. Finally, a characterization network 
2607 characterizes the sensor 2602 data, which may be provided directly from the sensor 2602 or 
preprocessing circuitry, or through the control 2601. A correlator 2608 correlates the output of 
the characterization network with the stored patterns, representing the templates from the 
template input 2610. The system therefore operates to recognize sensor patterns, based on the 
correlator 2608 output to the control 2601. 

When analyzing objects in a sequence of images, a determination is made of the 
complexity of the difference based on a density of representation. In other words, the error 
between the movement and transform compensated delayed image and the image is quantified, to 
determine if the compensation is valid, or whether the scene is significantly changed. When the 
difference has a complexity below a predetermined or adaptive threshold, a template is selected, 
from the stored templates, which most closely corresponds or correlates with both the set of 
identifiers of the image data and the set of identifiers of the delayed image data, thus improving 
recognition accuracy, by allowing a statistical correlation or other technique. The threshold may 
be set based on an error analysis of the system to determine statistical significance or using other 
criteria. The threshold may also be adaptively determined based on the history of use of the 
machine and feedback. For example, if the two images both have a high correlation with one 
template, and a first of the images has a slightly higher correlation with another template while 
the second image has a much lower correlation with that other template, then the system would 
score the first template as a better match to the first image, based on this differentiation. Thus, 
templates may be particularly selected to best differentiate similar images of objects. 
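
The example above may be made concrete with the following sketch. The function name `match_across_frames`, the use of a simple product to combine the two correlations, and the numeric values are assumptions chosen for illustration; the specification leaves the statistical technique open.

```python
from typing import Dict


def match_across_frames(
    corr_now: Dict[str, float],
    corr_delayed: Dict[str, float],
    diff_complexity: float,
    threshold: float,
) -> str:
    """Pick a template using both frames when the scene is judged unchanged."""
    if diff_complexity >= threshold:
        # Scene significantly changed: the delayed frame is not valid evidence.
        return max(corr_now, key=corr_now.get)
    # Scene stable: require agreement with both the image and the delayed image.
    combined = {t: corr_now[t] * corr_delayed.get(t, 0.0) for t in corr_now}
    return max(combined, key=combined.get)


first_image = {"template_A": 0.90, "template_B": 0.92}   # B is slightly higher here
second_image = {"template_A": 0.90, "template_B": 0.40}  # but much lower here
stable = match_across_frames(first_image, second_image, diff_complexity=0.1, threshold=0.5)
changed = match_across_frames(first_image, second_image, diff_complexity=0.9, threshold=0.5)
```

With a stable scene the combined score favors `template_A`, even though `template_B` scores marginally higher on the first image alone; when the difference exceeds the threshold, only the current image is used.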

EXAMPLE 25 

PATTERN RECOGNITION SYSTEM 

The present system allows for the use of a pattern recognition subsystem for a controller 
which acts in accordance with a detected pattern. In image, audio and multimedia applications, 
different types of image processing may take place. First, various processing algorithms may 
take place in parallel, with an optimum result selected from the results of the various algorithms. 
Further, various processing schemes may be applied in sequence, with differing sequences 
applied to different data streams. These processing schemes may be commutative, i.e. yield 
approximately the same result regardless of the processing order, or may be highly order 
dependent, in which case a processed data stream must include information relating to the 
sequence of processing for interpretation. 

Various exemplars may reside in a fragment library, for comparison with unidentified 
data. In the case of processing path dependent systems, an exemplar may be found in multiple 
forms based on the processing procedure, or in a small subset of corresponding libraries. In 
general, both lossless compression methods and lossy compression methods employed using 
high fidelity parameters to minimize loss may be processed to produce a relatively or almost 
unique result for each unknown data set, while lossy compression or processing methods will be 
particularly procedure sensitive, especially if differing strategies are employed. These differing 
strategies may be used to emphasize different features of the unknown data set in order to 
facilitate comparison. This technique is especially useful when the processing procedures are 
run in parallel, so that the latency penalty for redundant processing is minimized. Techniques 
available for this processing include vectorization, fractal processing, iterated function systems, 
spatial frequency processing (DCT- JPEG, MPEG, etc.), wavelet processing, Gabor transforms, 
neural nets (static or sequence of images), and other known techniques. 

In a preferred embodiment, a spatial frequency or wavelet processing step is performed 
first, on static image data or a sequence of images, with a fractal domain processing step 
performed thereafter. This allows high frequency noise to be initially filtered, with subsequent 
fractal-based correlated noise detection and subtraction, therefore allowing cleanup without loss 
of high frequency detail. Preferably, before the fractal-based processing, which may be 
performed by a digital computer or optical processing apparatus, standard edge detection/object 
separation, e.g., high frequency filtering, contour mapping, artificial intelligence, etc., may be 
performed. A fractal transform is then performed on the image or a portion thereof, starting in a 
standardized manner, e.g., at a point of lowest complexity, or the epicenter of the largest feature 
for beginning a contractive transform. The processed image may then be matched with one or 
more databases to identify all or a portion of the image. Optionally, after a match has been found 
and/or confirmed by an operator, using the human interface system, the method is then optimized 
to minimize the errors and increase the efficiency of later matches. This may be performed by 
modifying the database record, or related records, as well as modifying the preprocessing 
algorithm. In a preferred embodiment, the image is processed piecemeal, on an object-by-object 
basis. Therefore, after an object has been processed, it is extracted from the image so that the 
remaining information may be processed. Of course, multiple objects may be processed in 
parallel. The exemplar database is preferably adaptive, so that new objects may be added as they 
are identified. 

The present technology may also be used with a model-based exemplar database, wherein 
an image object is matched, based on a two dimensional projection, or analysis of a sequence of 
images, with a multidimensional model of an object. For example, the model may include 
volume, as well as multiple degrees of freedom of movement. Further, objects may also include 
"morphing" characteristics, which identify expected changes in an appearance of an object. 
Other types of characteristics may be included in conjunction with the exemplar in the database. 

In a preferred embodiment, a model contained in a database includes a three or more 
dimensional representation of an object. These models include information processed by a 
fractal-based method to encode repetitive, transformed patterns in a plane, space, time, etc., as 
well as to include additional degrees of freedom, to compensate for changes in morphology of 
the object, to allow continuous object identification and tracking. Thus, once an object is 
identified, an expected change in that object will not necessitate a reidentification of the object. 
According to one embodiment, a fractal-like process is executed by optical elements 
of an optical or optical hybrid computer. Further, in order to temporarily store an optical image, 
optically active biological molecules, such as bacteriorhodopsins, etc., may be used. Liquid 
crystals or other electrophotorefractive active materials may also be used. These imagers may be 
simple two dimensional images, holograms, or other optical storage methods. A preferred 
holographic storage method is a volume phase hologram, which will transform an impressed 
image, based on hologram to image correlation. Thus, these models would be somewhat linear 
transform independent, and would likely show some (planar) transform relationship. Thus, an 
optical computer may be advantageous because of its high computational speed as compared to 
digital computers for image analysis, due to inherent parallelism and high inherent speed. 

Because of the present limitations in speed of writing an image to optical recording 
media, especially holographic images, the preferred system includes a plurality of image storage 
elements, which are operated in parallel. It is noted that absolute accuracy of object 
identification is not required for "consumer" applications, and therefore partial match results may 
be considered useful. A plurality of partial results, when taken together, may also increase 
identification reliability. Critical applications generally differ in quantitative aspects rather than 
qualitatively, and therefore many aspects of the present invention may be applied to mission 
critical and other high reliability applications. 

A preferred object identification method proceeds by first classifying an object in an 
image, e.g., "car", "person", "house", etc. Then, based on the classification and object separation, 
an optimized preprocessing scheme is implemented. This 
classification preprocessing operates on the raw image data relating only to the object, separated 
from the background. Then, after the optimized preprocessing, a parallel recognition system 
would operate to extract unique features and to identify common features to be excluded from the 
comparison. This step could also identify variable features upon which identification should not 
be made because the distinctions are useless for the purpose. Thus, the object image at this point 
loses its relationship to the entire image, and the data reduction might be substantial, providing a 
compact data representation. The preferred algorithm has a tree structure, wherein the 
identification need only differentiate a few possibilities, and pass the result to another branch of 
the tree for further analysis, if necessary. Since the intermediate calculations may help in later 
computations, these should preferably be retained, in order to avoid duplicative analysis. 
Further, the order of analysis should be predetermined, even if arbitrary, so that once a useful 
intermediate calculation is identified, it may be passed in a regular, predictable manner to the 
next stage processing. Of course, one should not ignore that objects in the entire image may be 
correlated with one another, i.e., if one object is present, it would increase or decrease the 
likelihood of another object also being present. Further, temporal correlations should also be 
noted. Thus, the object identification need not proceed upon each object independently. 
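
One way to let already identified objects raise or lower the likelihood of other objects is sketched below. The multiplicative weighting, the renormalization, and the co-occurrence table are assumptions made for illustration; `apply_context` is a hypothetical name, not a component named in the specification.

```python
from typing import Dict, Iterable, Tuple


def apply_context(
    scores: Dict[str, float],
    present: Iterable[str],
    cooccurrence: Dict[Tuple[str, str], float],
) -> Dict[str, float]:
    """Reweight candidate labels by the objects already identified in the image."""
    present = list(present)
    adjusted = {}
    for label, p in scores.items():
        weight = 1.0
        for other in present:
            # A factor above 1 makes the label more likely; below 1, less likely.
            weight *= cooccurrence.get((other, label), 1.0)
        adjusted[label] = p * weight
    total = sum(adjusted.values())
    # Renormalize so the adjusted scores can again be compared as probabilities.
    return {l: v / total for l, v in adjusted.items()} if total > 0 else adjusted


# Hypothetical table: a road makes a car more likely and a boat less likely.
table = {("road", "car"): 2.0, ("road", "boat"): 0.25}
result = apply_context({"car": 0.5, "boat": 0.5}, present=["road"], cooccurrence=table)
```

Here an ambiguous object that scored equally as "car" and "boat" in isolation is resolved in favor of "car" once a road has been identified elsewhere in the image.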

Based on time sequences of two-dimensional images, a three dimensional image 
representation may be constructed. Alternatively, based on various presumptions about 
extractable "objects" in a single or small group of two dimensional images, a hypothetical three 
dimensional object may be modeled, which may be later modified to reflect the actual image 
when an actual view of hidden surfaces is shown. Therefore, by one means or another a three 
dimensional model is created, having both volume and surface characteristics. Of course, since 
inner structure may never be seen, the model normally emphasizes the surface structure, and is 
thus a so-called two-and-a-half dimensional surface model. Other non-integral dimension 
representations may also be useful, and fractal models may efficiently represent the information 
content of an image model. 

When the source signal is an MPEG 2 encoded datastream, it is advantageous to provide 
an exemplar database that does not require complete expansion of the encoded signal. Thus, the 
motion vector analysis performed by the MPEG 2 encoder may form a part of the pattern 
recognition system. Of course, image sequence description formats other than MPEG 2 may be 
better suited to pattern analysis and recognition tasks. For example, a system may transmit an 
interframe, by any suitable description method, as well as an object decomposed image in, e.g., 
fractal transform codes. The transmitted source material, other than interframes, is then 
transmitted as changes only, e.g. new objects, transforms of existing objects, translations of 
existing objects, etc. 

Color coding may make even more extensive use of fractal compression technology with 
high compression ratios, because absolute accuracy is not necessary; rather photorealism and 
texture are paramount, and need not be authentic. Therefore, backgrounds with significant detail, 
which would require substantial data in a DCT type system, could be simply coded and decoded 
without loss of significant useful information. Important to the use of this method is to 
discriminate between background textures and foreground objects, and to encode each separately, 
optimizing the processing based on the type of object being processed. 

EXAMPLE 26 

DATA CONTEXT SENSITIVE COMPUTER INTERFACE 
The present example relates to a context sensitive computer interface in which a 
characteristic of the interface is modified based on a linguistic or informational content of a data 
object upon which the interface is operating. For example, a number of alternate feature sets 
may be made available based on the type of data which is being operated on by the user. For 
example, differing feature sets would be optimal for each scientific discipline, each type of 
financial or economic field, marketing, retail, distribution, manufacturing, administration, human 
resources, etc. Such an interface will make it possible to provide an extended and extensible 
suite of application modules customized for the user in general, and further adaptive to the 
particular use which the user may be making of the apparatus. Thus, complex options 
particularly suited for the data at hand may be made available without inefficient interface 
searching, while inappropriate options are not presented. It is noted that this interface is 
responsive to the data, rather than the programming. Further, the data is analyzed for its 
meaning, rather than its type. 

In a word processing environment, a document or section of a document is analyzed for 
the presence of particular words or phrases, or for the presence of concepts, interpretable by 
linguistic concepts. This context-sensitive functionality does not require an explicit definition by 
the user, but rather will be present even during an incidental occurrence of a recognized context. 
In accordance with other aspects of the present invention, each context related function may have 
various user levels, which are selected based on an imputed user level of the user. Thus, the 
interface program must actually interpret the text or context of the user document in order to 
select the most likely options for use. 

Thus, if a user were to embed a table in a document, the available options would change 
to table-type options when the "active" portion of the document is at the table, i.e. within the 
viewable area, etc. Further, and more specifically, if the text and context of the table indicate 
that this is a financial table, financial options would be initially provided, and standard financial 
calculation functions immediately made available or performed, in contemplation of their 
prospective use. Similarly, if the data appears to be scientific, a different set of options would be 
initially available, and the standard scientific-type calculation functions be made available or 
performed. If the table relates to chemical or mechanical-type data, chemical or mechanical 
options might be made available, respectively. Embedded graphics, likewise, would be 
associated with graphics functions appropriate to the type of graphic. It is noted that, due to the 
analysis of the content of the document, software having generic functionality may present as 
special purpose software, based on its actual use. 

Thus, in a like manner, the system could determine the "style" of the document and 
automatically format the data in a predetermined manner to conform with general standards of 
presentations relating to the desired style. This is similar to style sheets of many programs, but 
the styles here are self-applying, and will, within the same document, be adaptive as the data changes 
context. Further, since the "styles" would be applied automatically, it would be relatively easy to 
alter them, requiring only a small amount of manual effort. This is so because the "keys" by 
which the system determines style could be stored, thus allowing redeterminations to be easily 
made. This context sensitivity could also assist in spelling and grammar checking, where 
different rules may apply, depending on the context. 

The data object includes information, which might be text, arrays of numbers, arrays of 
formulas, graphics, or other data types. The system relates parts of the object to each other by 
"proximity" which could be linear, in the case of a text document, or otherwise, such as in the 
case of a hypertext document or spreadsheet. Those parts or elements of the object closest to 
each other, by whatever criteria, are presumed to be topically related, regardless of data type. 
Thus, if a paragraph of text is proximate to a table of numbers, then the type of numbers 
presumed to occupy the table would relate to the content of the proximate text. If the text relates 
to finance, i.e. uses financial-related terms, or series of words that often occur in financial 
contexts, the table would be presumed to be a financial table. 
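
A minimal sketch of this proximity-based inference, assuming a linear document and a simple keyword count, is given below. The `Element` record, the `DOMAIN_TERMS` vocabulary, and the `window` parameter are hypothetical and serve only to illustrate the presumption that nearby elements are topically related.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

# Hypothetical vocabularies; a practical system would use far richer linguistic analysis.
DOMAIN_TERMS: Dict[str, Set[str]] = {
    "financial": {"revenue", "interest", "dividend", "quarter", "profit"},
    "scientific": {"hypothesis", "sample", "measurement", "experiment"},
}


@dataclass
class Element:
    position: int       # linear order within the document
    kind: str           # e.g. "text" or "table"
    content: str = ""


def infer_context(target: Element, elements: List[Element], window: int = 1) -> Optional[str]:
    """Presume the target shares the topic of the text elements closest to it."""
    neighbors = [
        e for e in elements
        if e.kind == "text" and 0 < abs(e.position - target.position) <= window
    ]
    counts = {domain: 0 for domain in DOMAIN_TERMS}
    for e in neighbors:
        words = re.findall(r"[a-z]+", e.content.lower())
        for domain, terms in DOMAIN_TERMS.items():
            counts[domain] += sum(1 for w in words if w in terms)
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


doc = [
    Element(0, "text", "Revenue and profit rose again this year."),
    Element(1, "table"),
    Element(5, "text", "Each experiment used a fresh sample."),
]
context = infer_context(doc[1], doc)
```

For a hypertext document or spreadsheet, the linear `position` would be replaced by whatever distance measure fits the structure of the object.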

Once the context of the part of the object is determined, the system then acts based upon 
this context. The major act is the presentation of tailored menus. This means that if the context 
is financial, the menus available for use with the numeric table relate to financial tables or 
spreadsheets. Further, the proximate text would be subject to financial oriented spellcheck and 
financial oriented grammar or style check. If a graphics-option is selected proximate to the text 
and table, the menu options would presume a financial graph and present appropriate choices. Of 
course, the options need not be limited to a few types, and may be hybrid and/or adaptive to the 
style of the user. However, it is noted that the adaptive menus could be linked to a "corporate 
style". Thus, communication styles could be dictated by a set of global rules for an organization. 
Of course, these a priori choices could be overridden. 

An advantage of this system is that it allows a software system to include a wide range of 
functionality which remains "buried", or relatively inaccessible, based on the context of usage. 
Thus, feature rich software would be considered more usable, and software could be provided in 
modular fashion. Since the system might allow a user to have potential access to many software 
modules, the system could also be linked to a license manager and per use billing system for 
rarely used modules, while allowing these to remain available on, e.g., a CD ROM. Thus, for 
example, a fully integrated package could employ a single, "standard" interface which would not 
require task-switching programs, while avoiding presentation of the full range of features to the 
user at each juncture. 

This system provides advantages over traditional systems by providing a non-standardized 
interface with a variable feature set which attains usability by adapting a subset of 
the available functionality based on the context of the data. 

EXAMPLE 27 

GROUP AWARE ADAPTIVE COMPUTER INTERFACE 

The adaptive interface according to the present invention may be used in group 
computing applications. In such a case, the predictive functionality is applied to allow the 
interface to apply rules from one group member to a project, even when that group member has 
not contributed personally to a particular aspect. This is thus a type of intelligent agent 
technology, which, according to the present invention includes the characteristics of abstraction 
and extrapolation, rather than rule based analysis which would fail based on divergent 
circumstances. This differs from a standard rule-based expert system because the intelligence 
applied is not necessarily "expert", and may be applied in a relative fashion. Further, extracted 
user characteristics need not completely define a solution to a problem, and indeed, the use of 
such a technology in group situations presupposes that a contribution of a number of users is 
desirable, and therefore that the expertise of any given user is limited. 

In order to ensure data integrity after the application or contingent application of user 
characteristics to a datastream, it is desirable to trace the evolution of data structures. This also 
allows for assistance in the organization and distribution of workgroup responsibilities. Thus, in 
a workgroup situation, the goal is not optimization of individual productivity, but rather 
optimization of the group result, including all levels of review after an initial phase is complete. 

Thus, while an individual user may seek various shortcuts to achieve various results, the 
group would benefit by having available all information relating to the path taken to achieve that 
result. Further, the desired result may be modified according to the presumed actions of the 
group, so that the final product is pre-optimized for the group, rather than the individual. Thus, a 
group member may have his "rules" extracted from his actions, i.e., by neural net 
backpropagation of errors programming or fuzzy rule definition, to be presented for 
consideration by another group member. This strategy will allow "better" drafts by considering 
the predicted input of a member prior to review by that member. A user may further tailor the 
rules for a given project, and "distilled wisdom" from non-group members may also be 
employed, as in normal expert (AI) systems. This group analysis is also known as collaborative 
filtering, and the tenets of that field may be fully applied herein. 

This rule-extraction technology as applied to workgroups is enhanced by the context 
sensitivity of the software, where the input of each group member may be weighted by 
considering the context. Again, this technique may be used to increase the efficiency of the 
primary author of a section of a project, as well as better defining the scope of responsibility of 
each member, while still respecting the input of other group members. 

According to this workgroup rule extraction technology, points of conflict between group 
members are highlighted for resolution. As an adjunct to this resolution phase of a project, 
videoconferencing may be employed. Further, where a conflict of a similar type had occurred in 
the past, data relating to the resolution of that conflict, including recorded videoconference, may 
be retrieved and presented to one or more members of the workgroup. In this way, such conflicts 
may be resolved before they become adversarial. Thus, each group member may efficiently 
proceed independently, with only major issues requiring meetings and the like to resolve. 
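
The highlighting of points of conflict may be sketched as follows, assuming for illustration that each member's extracted rules have already been reduced to a mapping from a topic to a preferred choice. That representation, and the name `find_conflicts`, are assumptions of this sketch; rules extracted by a neural net or fuzzy system would need a correspondingly richer comparison.

```python
from collections import defaultdict
from typing import Dict


def find_conflicts(rules_by_member: Dict[str, Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Return, per topic, the differing positions of the members who disagree."""
    positions: Dict[str, Dict[str, str]] = defaultdict(dict)
    for member, rules in rules_by_member.items():
        for topic, choice in rules.items():
            positions[topic][member] = choice
    # A topic is a point of conflict when members hold more than one distinct choice.
    return {
        topic: by_member
        for topic, by_member in positions.items()
        if len(set(by_member.values())) > 1
    }


extracted = {
    "ann": {"citation_style": "APA", "units": "metric"},
    "bob": {"citation_style": "MLA", "units": "metric"},
}
conflicts = find_conflicts(extracted)
```

Only the topics returned by `find_conflicts` would need to be raised for resolution; topics on which the members agree can proceed without a meeting.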

If a workgroup member disagrees with an imputed rule, either explicitly, by review of the 
rules, or implicitly, by a review of the results, the system will allow a review of all decisions 
influenced by that faulty rule, as well as a proposed correction. This may be addressed by any 
member of the group, but usually the author of the section or the source of the rule will be the 
relevant reviewing individual. Rules may also be created by the group, rather than from a single 
individual. Such rules are more often explicitly defined, rather than derived from observation. 
Such group rules may also be subjected to adaptive forces, especially when overridden 
frequently. 

EXAMPLE 28 

ADAPTIVE INTERFACE VEHICULAR CONTROL SYSTEM 
It is noted that the adaptive user level interface is of use in uncontrolled environments, 
such as in a moving vehicle, especially for use by a driver. An intelligent system of the present 
invention would allow the driver of such a vehicle to execute control sequences, which may 
compensate for the limited ability to interact with an interface while driving. Thus, the driver 
need not explicitly control all individual elements, because the driver is assisted by an intelligent 
interface. Thus, for example, if it begins raining, the interface would predict the windshield 
wipers should be actuated, the windows and any roof opening closed, and the headlights 
activated. Thus, the driver could immediately assent to these actions, without individually 
actuating each control. In such a case, the screen interface, which may be a heads-up display, 
would provide a small number of choices, which may be simply selected. Further, under such 
conditions, there would likely be a large amount of mechanical jitter from the input device, 
which would be filtered to ease menu selection. Further, this jitter might indicate an unstable 
environment condition, which would cause the interface to present an appropriate display. A 
voice input may also be used. 
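
The jitter filtering mentioned above, together with the use of the jitter itself as an indication of an unstable environment, might be sketched as follows. The exponential smoothing filter, the `alpha` constant, and the mean absolute deviation measure are assumptions chosen for simplicity; the specification does not prescribe a particular filter.

```python
from typing import List, Tuple


def smooth_and_measure(samples: List[float], alpha: float = 0.2) -> Tuple[List[float], float]:
    """Low-pass filter input-device positions and estimate the jitter removed."""
    if not samples:
        return [], 0.0
    smoothed = [samples[0]]
    for x in samples[1:]:
        # Exponential moving average: small alpha means heavier smoothing.
        smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    # Mean absolute deviation between raw and filtered positions.
    jitter = sum(abs(x - s) for x, s in zip(samples, smoothed)) / len(samples)
    return smoothed, jitter


steady = [5.0] * 10
shaky = [5.0 + (-1) ** i for i in range(10)]  # alternates between 6.0 and 4.0

_, steady_jitter = smooth_and_measure(steady)
shaky_smoothed, shaky_jitter = smooth_and_measure(shaky)
```

The filtered positions would drive menu selection, while a jitter estimate above some chosen level could cause the interface to present a simplified display with fewer, larger choices.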

EXAMPLE 29 

ADAPTIVE INTERFACE VEHICULAR CONTROL SYSTEM 
An integrated electronics system for an automobile is provided having control over 
engine, transmission, traction control, braking, suspension, collision avoidance, climate control, 
and audio systems. Steering and throttle may also be controlled. Based on driver preference and 
action patterns, the system may optimize the vehicle systems. For example, the vehicle may 
anticipate voluntary or road conditions based on implicit inputs of the user, thus readying 
vehicular systems prior to the actual encounter with certain conditions. Further, a user interface 
may be simplified, based on probable required functionality, thus limiting required attention by 
the driver in order to activate a particular control. By providing such an interface, controls 
normally inaccessible may be made accessible, without increasing mechanical complexity, e.g., 
functions normally controlled by computer may be accessed through a common user interface, 
rather than through dedicated manual controls. 

The automobile control system may also include collision avoidance systems, which may 
include imaging sensors and radar or LIDAR ranging and velocity measurement. According to 
the present invention, a heads-up display or simplified graphic user interface in the dashboard or 
near the steering wheel presents predicted options to the driver. An auxiliary interface may also 
make certain options available for passengers. 

According to another aspect of the present invention, an automobile positioning system is 
provided, which may be extraterrestrial, e.g., GPS, or terrestrial, e.g., cellular base station, 
LORAN, etc. Such a system is described in U.S. Patent No. 5,390,125, incorporated herein by 
reference; see references cited therein. A controller in the automobile is provided with an 
itinerary for the vehicle travel. Based on position and itinerary, the vehicle may communicate 
with various services, such as food, fuel and lodging providers, to "negotiate" for business. The 
driver may be provided with customized "billboards", directed to his demographics. 
Reservations and discounts may all be arranged while en-route. Communication between the 
automobile and the services is preferably provided by CDPD services, which is a cellular based 
832 MHz band digital data transmission system. Therefore, an existing cell phone system or 
CDPD modem system may be employed for telecommunication. Preferably, a simple display is 
provided for presentation of commercial messages to the driver or passenger and for interacting 
with the service. 

As a matter of practice, the service may be subsidized by the service providers, thus 
reducing the cost to the consumer. The extent of the subsidy may be determined by the amount 
of data transmitted or by the eventual consummation of the transaction negotiated. 

Because of the positioning system, any variance from the itinerary may be transmitted to 
the service providers, so that reservations may be cancelled, or substitute services provided in a 
different location or at a different time. 

The telecommunication system may also be used as an emergency system, to contact 
emergency services and/or police in the event of accident or distress. The transponder system 
may also be part of an antitheft system. The transponder may also be part of a vehicular 
maintenance and diagnostic system to ensure proper servicing and to help determine the nature of 
problems. Raw or processed data may be transmitted to a centralized station for full analysis and 
diagnosis. Because the vehicle need not be at the repair shop for diagnosis, problems may be 
analyzed earlier and based on extensive, objective sensor data. 

EXAMPLE 30

INTELLIGENT INTERNET APPLIANCE

A further application of the present technologies is in a so-called "Internet appliance".
These devices typically are electronic devices which have a concrete function (i.e., do more than
merely act as a generic server) and typically employ, at least as a secondary interface, a web
browser. In addition, these devices provide a TCP/IP network connection and act as a web
server, usually for a limited type of data. Therefore, in addition to any real human interface on
the device, a web browser may be used as a virtual interface.

According to the present invention, such an Internet appliance is provided with advanced
features, for example adaptivity to the user, to the
environment, or intelligent algorithms which learn. In fact, a preferred embodiment provides a
rather generic device which serves as a bridge between the Internet, a public packet switched
network which employs TCP/IP, and a local area network, for example in a residential, industrial
or office environment. The device may further abstract the interface functions for a variety of
other devices as nodes on either the Internet or local area network, to provide a common control 
system and interface. 

A preferred embodiment also encompasses certain other features which may be used as 
resources for the networked devices or as usable features of the device.

The Internet, or other wide area network, may be connected in any known manner, for
example, X.25/ISDN D-channel, dial-up over POTS (e.g., V.34, V.90, V.91), ISDN, xDSL,
ADSL, cable modem, frame relay, T1 line, ATM, or other communications system. Typically, a
system is provided with either a commonly used access method, such as V.90 or ISDN, or a
replaceable communications module with a generic interface. Such systems are well known.

The local area network is also well known, and may include, for example, as a physical
layer, 10Base-T, 100Base-T, HomeRun (Cat. 3 twisted pair/telephone twisted pair/power line
transmission, from Intel Corp., e.g., Intel 21145 device/Tut Systems), Universal Serial Bus
(USB), Firewire (IEEE-1394), optical fiber, or other known computer network. The protocol
may be, for example, TCP/IP, IPX, ATM, USB, IEEE-1394, or other known or proprietary
appropriate communications protocol.

While not required, a particular aspect of a preferred embodiment according to the 
present invention is the ability to interface "dumb" devices as nodes on the LAN with an 
intelligent device, while allowing the user to interact primarily with the intelligent device. This 
scheme therefore reduces redundancy and increases functionality. 

Therefore, in an exemplary embodiment, an intelligent home is established, with most or 
all electrical appliances and electronic devices interfaced with the system, for example through 
the aforementioned HomeRun system, using any of the supported physical layers. Each device is
provided as a relatively simple control, for example, remotely controllable (or where applicable, 
dimmable) lights, control over normal use and peak electrical demand of heavy appliances, as 
well as inter-device communications for consumer electronics. Therefore, the intelligent device 
acts as an external communications and control node for the entire network, and may, for 
example, control telephony functions in addition. 

Exemplary devices to be controlled in a home include household appliances, HVAC,
alarm systems, consumer electronics, and the like, as well as devices provided for communications purposes.
An alarm system embodiment, for example, may employ a video camera input for capture and 
analysis of images, as well as motion or irregularity detection. The intelligent device may, for 
example, employ neural networks or other intelligent analysis technology for analyzing data 
patterns indicative of particular states. An alarm output may be produced, for example, through 
standard alarms, as well as through a telephone interface of the system.

The system may therefore set/control/monitor the status of any home-based device:
oven, stove, alarm, washing machine, dryer, iron, lights, computer, oil/gas burner, thermostat,
location of automobiles, camera, pump (pool, sump), sprinkler, stereo/video systems, home
surveillance system. This may be especially important if the user is away from home for an
extended period of time, or if he or she wants to change the schedule of something, or travel
plans change. For a home surveillance system, pattern recognition may be employed to monitor
all sensors, including cameras, to detect abnormal patterns or changes in condition.

Thus, since the intelligent device incorporates a web server, the physical proximity of the 
user is not critical for interaction with the device, and all devices on the LAN may be controlled 
remotely, automatically, and in synchrony.

In one embodiment, the intelligent device includes a videoconferencing/video capture
system, including any or all known features for such systems, for example as described in the
background of the invention. Therefore, in addition to a base level of functionality, such an
embodiment would also likely include (a) telephony interface, (b) video capture, (c) video codec,
(d) audio capture, (e) audio codec, (f) full duplex speakerphone, (g) video output, and (h) audio
output.

In another embodiment, a speech interface is provided for interpreting human speech as
an input and/or producing synthesized speech as an output. Therefore, such a device would
include speech recognition and/or synthesis technologies, as well as a semantic data processor.

Preferably, the device allows use of a simplified web browser interface, such as may be
supported by personal digital assistants (PDAs) and enhanced digital data cellular
telephones, e.g., handheld device markup language (HDML). This, for example, allows a remote
user to communicate through wireless networks or the like, and therefore avoids the need for a
full personal computer as a human interface.

Advantageously, the device may be interfaced with a telephone communication system,
allowing use as a voice and/or video message recorder, and allowing remote access to the stored
information, either through a dialup connection and/or through the network. In this case, the
intelligent device may act as a computer telephony interface, and all communications devices
logically under this device act as "net phones", i.e., voice communications devices which
communicate over data networks. Therefore, all telephony control and computer telephony
functions may be integrated into the device, for example, voice mail, auto-attendant, call center,
and the like. Further, the Internet interface allows remote messaging and control over the
telephony system, as well as virtual networking, Internet telephony, paging functions, and voice
and data integration.

The intelligent device may also interface with various media electronics devices and, for
example, may act as a "rights server" or other aspect of a copyright protection and royalty
collection/enforcement system. Typically, these functions entail e-commerce functions, and may
require X.22 and/or XML communications and translations. In addition, such functions also
typically involve encryption/decryption, as well as key management, which are also preferably
supported by the device. Such support may be in hardware or software.

Another aspect of the invention provides an index and/or catalog database for media
information or media metadata information. Thus, data relating to a VCR tape or other recorded
media may be subjected to search criteria without requiring access or contemporaneous analysis
of the media content itself. Therefore, a preferred embodiment of the intelligent device includes
mass storage and retrieval capability, for example, magnetic disk, RW-CD, or RW-DVD. This
mass storage and retrieval capability may be used, not only for databases, but also for computer
software, media and content storage and retrieval. Thus, the device may also serve as a video
data recorder, capturing video data and storing it digitally, for example, employing the
aforementioned video and audio codecs. In this case, it is preferable that the intelligent device
also include a direct media access port, for example a broadcast TV tuner, ATSC/HDTV tuner,
cable tuner, DVD reader, CD reader, satellite video decoder, NTSC composite/S-VHS, and/or
other type of media content information input. With such storage, the intelligent device may also
assume the standard functions of computer network servers, for example, file serving, print
serving, fax serving, application serving, client/server application support, as well as traditional
networking functions, such as bridging, routing, switching, virtual private network, voice-over-
IP, firewall functions, remote access serving, and the like. It should also be apparent that the
intelligent device may also serve as a personal computer itself, and thus does not require 
additional systems for basic functionality. 

In a media recording system embodiment, the system preferably notifies the user if the
"program", i.e., the set of instructions, is incomplete, ambiguous, or impossible to complete. For
example, if a single channel selector is provided, no more than one channel may be monitored at
a time. Further, where irreversible actions are necessary, the user is preferably informed and
allowed to make a choice, for example, if lack of storage space forces a choice to be made
between new and archival material. A conflict management system is provided which arbitrates
between the conflicting demands, for example if a second user is programming the same device
(for example, the VCR) to record a show at the same time.
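
One way such arbitration could be sketched, assuming a single channel selector and a simple numeric priority per request, is shown below in Python. The `Request` fields and the priority rule are hypothetical; the text above does not specify how conflicts are ranked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """A recording request (hypothetical structure)."""
    user: str
    channel: int
    start: int  # minutes since midnight
    end: int    # minutes since midnight, exclusive
    priority: int = 0


def overlaps(a, b):
    """True if the two requests share any recording time."""
    return a.start < b.end and b.start < a.end


def arbitrate(requests):
    """Accept requests in priority order for a device with one channel selector.

    A request is rejected when it overlaps an accepted request on a different
    channel. Overlapping requests for the same channel can share the recording.
    Returns (accepted, conflicts), where each conflict pairs the rejected
    request with the accepted requests that block it, so the user can be told.
    """
    accepted, conflicts = [], []
    for req in sorted(requests, key=lambda r: (-r.priority, r.start)):
        blocking = [a for a in accepted if overlaps(a, req) and a.channel != req.channel]
        if blocking:
            conflicts.append((req, blocking))
        else:
            accepted.append(req)
    return accepted, conflicts
```

The `conflicts` list is what would drive the user notification: each entry names the request that cannot be honored and the request that takes precedence.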

Thus, it is apparent that the intelligent device according to this embodiment of the present 
invention may incorporate many different functions, some of which are defined purely by 
software and processing availability, and others by particular hardware devices for performing 
specific functions. 

Another aspect of the invention defines a special training mode of the intelligent device, 
which allows the user to improve the functionality of the system by ensuring that any intelligence 
algorithms will correctly operate in an anticipated and/or desired manner. In this mode, 
responses of the user are provoked which indicate user preferences, preferably in a manner which 
resolves ambiguities encountered with prior data sets. Thus, where the system identifies a 
situation where a decision is difficult, e.g., where the data analysis does not output any selected 
actions which will likely correspond to the user desires or preferences, or where ex post facto the 
user indicates that an inappropriate choice was made, the particular data structures may be stored 
and abstracted for later presentation to the user. In this case, such structures are presented by the 
system to the user, during a training session, to train the system relating to the desired response 
to particular data environments. In this way, the user is not necessarily burdened with training 
tasks during normal use of the device, and opportunities for such training are not lost. Where the 
system is untrained, and an "intelligent" response or mode of operation cannot be resolved, a 
default mode of operation may be defined. Further, such a default mode is preferably always 
available, at the request of the user, thus allowing use where an adaptive system is undesired or 
difficult to employ. 
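
The deferral of difficult decisions to a later training session can be illustrated with the following Python sketch. The confidence margin, the string encoding of a "situation", and the `ask` callback are assumptions made for the example; the text does not prescribe any particular data structure.

```python
from collections import deque


class TrainingQueue:
    """Defers ambiguous decisions to a later training session (illustrative only)."""

    def __init__(self, margin=0.2, default_action="default"):
        self.margin = margin              # minimum gap between the two best scores
        self.default_action = default_action
        self.pending = deque()            # situations awaiting user training
        self.learned = {}                 # situation -> action chosen by the user

    def _defer(self, situation):
        if situation not in self.pending:
            self.pending.append(situation)

    def decide(self, situation, scores):
        """Pick an action from {action: confidence}; fall back to the default if unclear."""
        if situation in self.learned:
            return self.learned[situation]
        ranked = sorted(scores.values(), reverse=True)
        unclear = not ranked or (len(ranked) > 1 and ranked[0] - ranked[1] < self.margin)
        if unclear:
            self._defer(situation)
            return self.default_action
        return max(scores, key=scores.get)

    def flag_wrong(self, situation):
        """Record that the user judged an earlier choice inappropriate."""
        self.learned.pop(situation, None)
        self._defer(situation)

    def training_session(self, ask):
        """Present each stored situation to the user via the `ask` callback."""
        while self.pending:
            situation = self.pending.popleft()
            self.learned[situation] = ask(situation)
```

During normal use only `decide` and `flag_wrong` are called, so the user is not interrupted; `training_session` is run when the user chooses to train the device.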

In a television application, the Internet appliance preferably has access to an electronic 
program guide (EPG). Such EPG systems are known, and typically provide an efficient starting
point for user programming. These EPGs may be provided as an embedded signal in a broadcast
stream, through a dial-up network, through the Internet, or on distribution media, such as
CD-ROM, OCR scanning of TV-Guide (or the like) or other known means. EPGs contain a concise
semantic description of program content, which typically is both sufficient for user evaluation,
and brief enough for rapid evaluation. The system may therefore analyze user preferences in this
semantic space and provide adaptive presentation of elements of the EPG to the user. Of course,
a media data stream analysis embodiment of the invention, as disclosed above, may be used in
conjunction with or in lieu of the EPG system. See, U.S. Patent No. 5,867,226, expressly
incorporated herein by reference.

The system preferably maintains an updated index of available data. Thus, newly 
acquired data is added to the index, and deleted data is purged from the index. The system 
preferably compares new data to previously encountered data, to avoid redundant processing. 
For example, the system preferably recognizes events/programs that have previously been 
recorded, and checks to determine whether they are still in the index. In this context, the user is 
preferably provided with low-level file maintenance tools, for example to manually control the 
addition or deletion of data, which is then correctly represented in the index. 
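
A minimal Python sketch of such an index follows, assuming that a program is identified by a normalized title and episode string. That keying scheme, and the `MediaIndex` class itself, are hypothetical and chosen only to show the add, purge, and duplicate-check operations.

```python
import hashlib


class MediaIndex:
    """Keeps an index of stored programs in step with additions and deletions."""

    def __init__(self):
        self._by_key = {}

    @staticmethod
    def key_for(title, episode):
        """Derive a stable key from normalized title and episode text (an assumption)."""
        normalized = f"{title.strip().lower()}|{episode.strip().lower()}"
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def contains(self, title, episode):
        """True if this program was recorded before and is still indexed."""
        return self.key_for(title, episode) in self._by_key

    def add(self, title, episode, location):
        """Index newly acquired data; return False if it is already present."""
        key = self.key_for(title, episode)
        if key in self._by_key:
            return False  # previously recorded, so redundant processing is skipped
        self._by_key[key] = {"title": title, "episode": episode, "location": location}
        return True

    def purge(self, title, episode):
        """Remove deleted data from the index; return True if an entry was removed."""
        return self._by_key.pop(self.key_for(title, episode), None) is not None
```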

Because the Internet appliance is connected to the Internet, so-called multicasts may be 
monitored for correspondence with user preferences. Therefore, it is understood that the 
operation of the present invention is not limited to traditional television broadcasts, and that 
streaming video and audio, as well as stored images, sound files (e.g., MIDI, MP3, A2B, 
RealAudio), text, and multimedia streams may be analyzed based on the adaptive principles 
presented herein. 

The system may also integrate Internet data with other types of data, for example 
providing access to stored or static data corresponding to a data stream. The retrieval and storage 
of such data may also be adaptively controlled in accordance with the present invention. Thus, it 
is expressly understood that the intelligent device may act as a "VCR" (albeit not necessarily 
employing a known type of videocassette tape), to record media. 

The Internet appliance may also operate autonomously, capturing data which corresponds 
to user preferences and profiles, thus reducing latency for the user, and potentially shifting data 
transfers to off-peak periods. Such a system operates in this mode as a so-called "agent" system. 
Likewise, the device may also be linked to other intelligent devices, to provide an intelligent 
interaction therebetween.

The preferred user interface maintains user levels constant over long periods, i.e., not
rapidly adaptive, to allow for quick accessing over a low bandwidth connection, such as a
telephone, or using succinct displays, such as might be found on a personal digital assistant.
Thus, the user can rely on memory of the interface functionality and layout to reduce data
transmissions and reduce search time. In one embodiment, the interface may be "forced" to a
particular type, as either a permanent interface, or as a starting point for adaptivity. Thus, the
user may be provided with an interface design mode of operation.

The user interaction with each "device", which may be real or virtual (implemented as a
software construct in a relatively general purpose computer), is preferably carefully designed for
each device. A common user interface paradigm is preferably provided for corresponding
functions, while the user interface is preferably optimized for dealing with the specific functions
of each particular device. Thus, a similar user interface and screen layout is employed for
functions that are the same across a variety of devices. In this regard, it is an aspect of an
embodiment of the invention to translate user interface systems, even in a high level state, to
other forms. Thus, in a multi-brand environment, related components may have native interfaces
that are both well developed and distinctly different. Therefore, the present invention allows for
a translation or remapping of the functionality into a common paradigm. Where aspects cannot
be adequately translated, the native interface may be presented to the user.

EXAMPLE 31

SET TOP BOX WITH ELECTRONIC COMMERCE CAPABILITY

Known systems for accounting and payment for on-line transactions include credit and
debit card transactions, direct deposit and wire transfer, Micro Payment Transfer Protocol
(MPTP) (www.w3.org), Millicent (Compaq Computer Corp.), and a number of other systems.
Typically, these seek to be secure, i.e., to ensure, with some degree of reliability, against the risk of
non-payment. The following U.S. Patents, expressly incorporated herein by reference, define
aspects of micropayment and on-line payment systems: 5,930,777; 5,857,023; 5,815,657;
5,793,868; 5,717,757; 5,666,416; 5,677,955; 5,839,119; 5,915,093; 5,937,394; 5,933,498; and
5,903,880. See also, Rivest and Shamir, "PayWord and MicroMint: Two Simple Micropayment
Schemes" (May 7, 1996), expressly incorporated herein by reference; Micro Payment Transfer
Protocol (MPTP) Version 0.1 (22-Nov-95) et seq., http://www.w3.org/pub/WWW/TR/WD-mptp;
Common Markup for Web Micropayment Systems, http://www.w3.org/TR/WD-Micropayment-Markup
(09-Jun-99).

Advantageously, a micropayment scheme is implemented to credit or debit accounts of
advertisers, users, service providers, and content owners, for example. By facilitating small
monetary transfers, such as between about $0.05 and $5.00, the relatively small dollar values and
large audience sizes may be accommodated. This, in turn, will likely make set top box-delivered
entertainment industry content efficient, and potentially allows for the close-knit
integration of e-commerce. For example, through integration of the Internet and streaming
broadband media, a complete commercial transaction may be concluded directly, instead of the
user merely being linked to an Internet web site operated by a commercial provider. For
authentication of the user, typical
means may be employed, such as passwords and the like, or more sophisticated techniques such
as facial recognition, which may share the video pattern recognition systems
within the device and the video-conferencing hardware. Thus, making an impulse purchase based on
an advertisement may be as simple as pushing a single button on a remote control.
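
The crediting and debiting of accounts described above can be pictured with a small Python ledger sketch. It is not an implementation of MPTP, Millicent, or any other named protocol; the `Ledger` class, the party names, and the strict enforcement of the approximate $0.05 to $5.00 range are assumptions made for illustration. Amounts are held as integer cents to avoid rounding error.

```python
class Ledger:
    """Tracks balances, in integer cents, for parties to small transfers."""

    MIN_CENTS = 5     # about $0.05, the lower end of the range given in the text
    MAX_CENTS = 500   # about $5.00, the upper end of the range given in the text

    def __init__(self):
        self.balances = {}

    def transfer(self, payer, payee, cents):
        """Debit the payer and credit the payee by the same amount."""
        if not isinstance(cents, int) or not self.MIN_CENTS <= cents <= self.MAX_CENTS:
            raise ValueError("amount outside the micropayment range")
        self.balances[payer] = self.balances.get(payer, 0) - cents
        self.balances[payee] = self.balances.get(payee, 0) + cents
```

For instance, a viewer might pay a content owner 25 cents for a program and receive 10 cents from an advertiser for watching a commercial, leaving the viewer with a net debit of 15 cents.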

With transactions having a higher economic value, further safeguards may be 
implemented, and for example a written contract or receipt could be generated, executed, and 
returned to the vendor, all using a simple set-top box system with attached printer and scanner 
(or use of a video camera as image input device). 

The payment or micropayment scheme may be integrated with a content 
management/digital watermarking/copy protection scheme, for example where the transaction 
purchases a limited license in an electronic audio-visual work. The system typically 
automatically triggers a monetary transaction to compensate the proprietary rights holder, 
although under certain circumstances the delivery of the work and the compensation for viewing 
may be decoupled. For example, as explained elsewhere herein, the content may be stored in a 
privileged storage medium. Thus, the accounting for use occurs upon substantial viewing, and 
not upon mere downloading to a "buffer". Alternately, the privileged store is encrypted, and the 
decryption key is provided only upon payment. Thus, in this case, the payment transaction may
be relatively simple, and not require a complete download of a massive audio-visual work.

Typically, a pay-per-view work will be downloaded in a push process to multiple set top
boxes using a common encryption key. Once received by an individual addressable box, the
work will be re-encrypted based on the identity or identifier of the hardware, using a public
key-private key system. Thus, using the public key of the identified hardware, a private key is
transmitted for decrypting the work, and an accounting transaction may be performed to compensate
the content provider. This system may also work to subsidize the viewing of content. If a viewer
is willing to receive certain commercials (which may be stored in mass storage on the hardware
or streamed using broadband or packet technology), a payment in favor of the viewer may be
received. If the hardware has viewer sensing technology, the compensation may be based on the
individuals watching the commercial. If the commercial is time shifted, compensation may be
arranged depending on the time of viewing and a formula, which for example may account for
staleness of the commercial.
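
The text leaves the staleness formula open. One possible form, offered purely as an assumption, is an exponential decay in which the credit halves after a fixed delay, as in this Python sketch. The function name and the 48-hour half-life are hypothetical.

```python
def viewer_credit_cents(base_cents, hours_delayed, half_life_hours=48.0):
    """Credit for watching a time-shifted commercial, reduced as it grows stale.

    The exponential half-life model is an illustrative assumption; any
    monotonically decreasing function of the delay could be substituted.
    """
    delay = max(hours_delayed, 0.0)          # viewing before airing is not rewarded extra
    freshness = 0.5 ** (delay / half_life_hours)
    return round(base_cents * freshness)
```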

The value may also depend on the correspondence of the commercial to one or more user 
preference profiles of the respective viewers. Typically, the compensation model will not be the 
simple aggregate sum of the values for each user. This is because typically, the purchases of the 
group are not uncorrelated, and therefore the aggregate sum of the values would tend to 
overestimate the commercial potential of the group. Likewise, the values for any one individual 
would tend to underestimate the potential of the group. Therefore, a more sophisticated 
demographic and group (typically family or communal group) analysis should be employed. 
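
The bounds stated above, with the largest individual value as a floor and the simple sum as a ceiling, can be captured in a short Python sketch. The linear interpolation and the single `correlation` parameter are assumptions introduced here for illustration and are far simpler than the demographic analysis the text calls for.

```python
def group_value(individual_values, correlation):
    """Estimate the value of showing a commercial to a group of viewers.

    correlation = 1.0 means the group buys as one unit (value of the best
    individual); correlation = 0.0 means fully independent purchasers (simple
    sum). Intermediate values interpolate linearly between the two bounds.
    """
    if not 0.0 <= correlation <= 1.0:
        raise ValueError("correlation must lie between 0 and 1")
    if not individual_values:
        return 0.0
    top = max(individual_values)
    total = sum(individual_values)
    return top + (1.0 - correlation) * (total - top)
```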

In order to register the viewers present, a number of methods may be employed, for 
example video observation, voice verification, fingerprint or retinal scan technologies, voluntary 
identification, or the like. Preferably, little additional hardware is employed and the registration 
process employs hardware otherwise provided for other purposes; however, fingerprint scanners 
and retinal scanners are useful, even if they incur an additional hardware cost. 

The types of content delivered may include images, video, multimedia clips, music, text 
content, templates, software and applets, and any other sort of information. 

The micropayment and rights accounting system may be provided by the system operator, 
i.e., a broadband cable system operator, or by a third party. Thus, a communications system 
outside the cable (or satellite) network may be provided. The hardware system according to the 
present invention may, for example, be integrated with a known cable modem or DSL system, or
employ a separate analog POTS modem. By providing such an open communication system, it
is possible to maximize the flexibility and the value of communications, essentially allowing
completely customized communications. With an intelligent set top box, having a video storage
facility, it is possible to create customized presentations by directly addressing the box with a
preformed communication, directing a common presentation to the box which is then customized
by an individually addressed customization, or by allowing the box to automatically customize
based on stored data, which need never leave the box. The system therefore supports various
levels of user privacy. In order to support some functions, user information might be required to
be transmitted to a cable operator, information aggregator or commercial vendor; for other
functions, a fully customized presentation may be generated without any outside transmission of
data. The accounting system may also accommodate various levels of privacy. At one end of
the spectrum, a commercial vendor has a complete identification of the viewer; at the other,
neither the commercial vendor nor the transmission system operator has information as to the
viewer or any activities thereof.

In practice, some waiver of anonymity may be required for effective auditing. However,
the Nielsen and Arbitron rating systems are built on a user reporting or observation platform, and
thus user acceptance is not likely to be difficult. On the other hand, direct advertiser feedback of
viewer information, except by voluntary action, such as direct contact, contest entry, purchase,
and the like, is likely to be strongly resisted. Thus, an effective proxy filter is preferred to
separate accounting issues from advertiser feedback.

EXAMPLE 32 

USER INPUT OF PREFERENCES 

The system according to the present invention accommodates at least two different means
of user definition of preferences. In a first mode, a user specifically or explicitly makes choices,
much as in a questionnaire, to define explicit preferences. Alternately, a demographic profile
may be obtained, which is then correlated with likely user preference based on collaborative
filtering principles. These principles may, in turn, be explicitly defined as a set of rules or fuzzy
rules, or derived from observation of persons with like demographic profiles. Typically, the
questionnaire will be presented as a series of one or more screens, which may be part of a graphic
user interface or character-mode on-screen display interface. The data will typically be stored
locally in the hardware, and not transmitted, in order to preserve user privacy, but in certain
circumstances transmission to a server may be acceptable. In order to avoid transmitting the user
information to the server, the client appliance (e.g., set top box) must filter and select available
content that meets the user criteria or corresponds to the user preference profile.
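
Client-side selection of this kind might look like the following Python sketch, in which the profile never leaves the appliance and only the broadcast catalog is received. The tag-weight representation of the profile and the additive scoring rule are assumptions for the example.

```python
def select_content(catalog, profile, threshold=1.0):
    """Rank catalog items against a locally stored preference profile.

    catalog: list of {"title": str, "tags": [str, ...]} entries received by the box.
    profile: {tag: weight} held only on the client; negative weights mean dislike.
    Returns titles scoring at or above the threshold, best first.
    """
    scored = []
    for item in catalog:
        score = sum(profile.get(tag, 0.0) for tag in item["tags"])
        if score >= threshold:
            scored.append((score, item["title"]))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored]
```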

The user preference profile may also be derived implicitly by monitoring of the user's 
activities. These may include not only the selected content, but also the time of viewing, other 
persons with whom viewed, explicit feedback from the user, e.g., a binary like/dislike or a more 
fine-grained or multivariate evaluation. 

Where the system employs content-based analysis of a media stream, it is also possible to 
rate temporal portions of the media stream, much as political analysts rate politicians'
performances during long speeches or debates. Therefore, rather than an analysis of the whole,
user preference may be applied to particular scenes of a movie, for example. This, in turn, may 
be used to adaptively edit content. Thus, typical movies are edited for a showing time of 1.5 to 2 
hours. Often, longer versions are available with additional scenes deemed non-critical for the 
performance, but otherwise meritorious. Therefore, a longer version of a movie may be streamed 
to a plurality of viewers or potential viewers, along with a scene list and description, which may 
be automatically or manually generated. The client device may then correlate the user 
preferences with individual scenes, potentially selecting longer or shorter sequences, or editing 
out portions entirely. 
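
The correlation of preferences with individual scenes can be sketched in Python as follows. The `Scene` record, the `essential` flag that marks scenes required for the story, and the tag-based scoring are all hypothetical; the text only requires that a scene list and description accompany the stream.

```python
from dataclasses import dataclass


@dataclass
class Scene:
    """One entry of the scene list sent with the longer version (hypothetical)."""
    label: str
    minutes: float
    tags: tuple
    essential: bool = True  # optional scenes may be dropped by the client


def edit_for_viewer(scenes, profile, min_score=0.0):
    """Keep every essential scene, plus optional scenes the viewer is likely to enjoy."""
    cut = []
    for scene in scenes:
        score = sum(profile.get(tag, 0.0) for tag in scene.tags)
        if scene.essential or score > min_score:
            cut.append(scene)
    return cut
```

Because the order of the scene list is preserved, the result can be played back directly as the viewer's customized edit.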

Similar technology allows interactive or immersive presentations, in which the user input 
controls the presentation in the manner of an immersive story video game. 

The user profile(s) may also be provided using both explicit and implicit data. Further, 
extrinsic data may be submitted to the system, such as information contained in typical credit 
reports and other private mass archives of personal information. This may include income and
spending data, geographical demographic data, credit card and usage information, and the like. 

EXAMPLE 33 

ELECTRONIC PROGRAM GUIDE AND CONTENT ANALYSIS SYSTEM

In seeking to best make decisions relating to the content of a media stream, an electronic
program guide or EPG is generally useful as a source of human editorial information relating to a
media stream. This information is generally accurate, and properly parsed into standardized
fields, making it easily searchable. On the other hand, such EPGs typically define the content of
a "program" as a whole, and must be prepared in advance of the transmission, and thus have little
detail relating to live or near live broadcasts, such as sports, television news, talk shows, news
feeds, and the like. Automated content analysis, on the other hand, while available for real time
or near real time media streams, is limited by the reliability of the algorithms employed, which
is typically substantially less than 100%. Content analysis algorithms also provide the ability to
characterize individual scenes or even frames of a media stream, which may represent concepts
totally distinct from those indicated in an EPG describing the program as a whole. Therefore,
the present invention also provides a system that employs both EPGs and content analysis of
media streams, seeking to best characterize a media stream for action thereon. In such a system,
the EPG is mostly relied upon for defining candidate programs, while the content analysis
subsystem is relied upon for filtering the programs. The criteria used by each system may differ
markedly, or be defined by a unified user preference profile or artificial agent scheme.

For example, in a business setting, an intelligent agent may be provided to screen
broadcasts for news reports relating to certain stocks or companies. In this case, the EPG first
defines news reports being broadcast. After determining which broadcasts are news, the content
filter then analyzes the content, for example by OCR of screen alphanumeric characters, speech
recognition, and monitoring of closed caption text, if available. News stories that meet the
desired characteristics are then stored for later viewing or immediately presented, for example.
After defining stories of potential interest, the content may then be analyzed for significant core
concepts, which may then be used to filter other stories that might be related. Thus, an
intelligent and iterative process may be defined to filter and present information which meets
certain criteria, which may be explicitly defined, such as by stock ticker symbol, or implicitly
defined, such as by an indication of "track similar stories" by the user.
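
The two-stage arrangement, with the EPG nominating candidates and content analysis filtering them, is sketched below in Python. Closed caption text stands in for the content analysis subsystem; the dictionary layout of the EPG entries and the word-matching rule are assumptions for the example, and a practical system would also draw on OCR and speech recognition as described.

```python
import re


def candidate_programs(epg, genre):
    """Stage 1: use the EPG to pick programs of the wanted genre."""
    return [entry for entry in epg if entry["genre"] == genre]


def matched_terms(caption_text, terms):
    """Stage 2 helper: which search terms occur as whole words in the captions."""
    words = set(re.findall(r"[A-Za-z0-9]+", caption_text.upper()))
    return sorted(term for term in terms if term.upper() in words)


def screen(epg, captions, genre, terms):
    """Return {program id: matched terms} for candidates whose captions match.

    captions maps program id -> caption text captured so far (may be missing).
    """
    hits = {}
    for entry in candidate_programs(epg, genre):
        found = matched_terms(captions.get(entry["id"], ""), terms)
        if found:
            hits[entry["id"]] = found
    return hits
```

Programs returned by `screen` would then be stored or presented, and their text could seed further terms for the iterative "track similar stories" behavior.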

It should be understood that the preferred embodiments and examples described herein 
are for illustrative purposes only and are not to be construed as limiting the scope of the present 
invention, which is properly delineated only in the appended claims.